ABSTRACT

This thesis examines the network management functions
required for a local computer network. Initially, general
management considerations are addressed. These include
problem determination, performance analysis, problem
management, change management, configuration management, and
operations management. The sidestream, mainstream,
centralized, decentralized, and hybrid network monitoring
technologies are then discussed. An investigation of
network measurement tools and their use in generating
management reports is undertaken. The topics of analysis
timing, performance measure utilization, and parameter
selection are considered. Procedures for detecting,
diagnosing, and correcting network component failures are
presented. Solutions are proposed for problems associated
with managing a local computer network to long haul network
interface. Finally, the mission, objectives, and
responsibilities of a local computer network central
monitoring site are discussed.


TABLE OF CONTENTS

I.    INTRODUCTION
      A.   ASSUMPTIONS
           1.   Network Topology and Transmission Medium
           2.   Network Layer Protocol
           3.   End to End Protocol
      B.   LAN ARCHITECTURE
           1.   Local Network Logical View
           2.   Local Network Physical View
      C.   NETWORK MANAGEMENT DISCIPLINES
           1.   Problem Determination
           2.   Performance Analysis
           3.   Problem Management
           4.   Change Management
           5.   Configuration Management
           6.   Operations Management

II.   DESIGN ISSUES IN NETWORK MONITORING
      A.   NETWORK MONITORING METHODOLOGIES
           1.   Hardware Methodology
           2.   Software Methodology
           3.   Hybrid Methodology
      B.   NETWORK MONITORING TECHNOLOGIES
           1.   Sidestream Monitoring
           2.   Mainstream Monitoring
           3.   Centralized Monitoring
           4.   Decentralized Monitoring
           5.   Hybrid Monitoring
      C.   CHAPTER SUMMARY

III.  NETWORK MEASUREMENT TOOLS, AND MEASUREMENTS AND STATISTICS
      A.   NETWORK MEASUREMENT TOOLS
           1.   Cumulative Statistics
           2.   Trace Statistics
           3.   Snapshot Statistics
           4.   Artificial Traffic Generators
           5.   Emulation
           6.   Network Measurement/Control Center
      B.   MEASUREMENTS AND STATISTICS
           1.   Host Communication Matrix
           2.   Group Communication Matrix
           3.   Packet Type Histogram
           4.   Data Packet Size Histogram
           5.   Throughput-Utilization Distribution
           6.   Packet Interarrival Time Histogram
           7.   Channel Acquisition Delay Histogram
           8.   Communication Delay Histogram
           9.   Collision Count Histogram
           10.  Transmission Count Histogram
      C.   CHAPTER SUMMARY

IV.   NETWORK PERFORMANCE ANALYSIS AND COMPONENT FAILURE
      A.   PERFORMANCE ANALYSIS TIMING
           1.   Off-Line Analysis
           2.   On-Line Analysis
           3.   Instantaneous Analysis
      B.   LAN PERFORMANCE ANALYSIS
           1.   Performance Measure Utilization
           2.   Performance Parameter Selection
      C.   COMPONENT FAILURE
           1.   Failure Detection
           2.   Failure Diagnosis
           3.   Failure Notification
      D.   CHAPTER SUMMARY

V.    MANAGING THE LAN/DDN INTERFACE
      A.   GATEWAY CONFIGURATION
      B.   PACKET SIZING
      C.   CONGESTION CONTROL
           1.   LAN to LHN Packet Control
           2.   LHN to LAN Packet Control
      D.   ADDRESSING AND NAMING
      E.   ACCESS CONTROL
      F.   OTHER CONSIDERATIONS
      G.   CHAPTER SUMMARY

VI.   LAN CENTRAL MONITORING SITE
      A.   MISSION OF A LAN MONITORING SITE
      B.   MANNING AND ORGANIZATION OF A LAN CMS
      C.   A NETWORK OPERATOR'S WORKBENCH
      D.   OPERATOR'S ACTIONS: NORMAL CONDITIONS
           1.   Initialization
           2.   Utility Data Bases
           3.   Operator's Displays
           4.   Normal Management Activities
      E.   OPERATOR'S ACTIONS: COMPONENT FAILURE
      F.   CHAPTER SUMMARY

APPENDIX A:  PROBLEM MANAGEMENT RECORD ENTRIES

APPENDIX B:  CONFIGURATION MANAGEMENT RECORD ENTRIES

LIST OF REFERENCES

INITIAL DISTRIBUTION LIST


LIST OF FIGURES

1.1  Local Network Logical View
1.2  Local Network Physical View
2.1  Hardware Monitoring Device: Logical View
2.2  Hybrid Monitoring Device: Logical View
2.3  Centralized Monitoring
2.4  Decentralized Monitoring
2.5  Hybrid Monitoring


One of the major objectives of any local network is to
provide reliable communications facilities, reflected both
in the continued availability of the network itself and in
the lowest possible error rate as seen by individual
processes [Ref. 14: p. 713]. To this we would add the
requirements of high capacity and minimal end-to-end delay
experienced by the user. We now submit what we feel is a
responsible and complete definition of network management.
Our definition includes: collection of measurements and
subsequent statistics generation; hardware and software
failure detection, diagnosis, and correction; network
performance analysis; and network parameter adjustment.

One school of thought advocates management of local area
computer networks, while another feels that management, as
we have defined it, is not required. We support the former.
The benefits to be gained from the management of a local
computer network are numerous. We are able to reduce the
impact of failures and increase network availability by
detecting, diagnosing, and correcting hardware and software
problems very quickly. Control and monitoring technologies
allow network operators to anticipate problems. Rather than
reacting, operators are able to analyze problems and take
appropriate action to minimize them, or even preclude their
occurrence. Management of a local computer network gives us
the ability to provide for capacity planning, manage the
growth of the network, control costs, and eliminate
redundant or unused capacity. We can also improve the
network's performance and its availability to users by
monitoring the network components and by evaluating the
network as a whole.


It is the author's intent to identify and discuss the
tools required by network management for the attainment of
these and other benefits. We will begin by describing a
SPLICE local area computer network, followed by a discussion
of six network management disciplines. Chapter 2 will
present various network monitoring methodologies and
technologies. In Chapter 3, we enter into a discussion of the
measurement tools available to the operator and suggest ten
management reports to be generated from collected data.
Chapter 4 provides information on analysis timing, network
performance measure utilization and parameter selection, and
on component failure detection, diagnosis, and notification.
Chapter 5 identifies and suggests solutions for the problems
associated with managing the LAN/DDN interface. In Chapter
6, we conclude with a discussion of the mission, objectives,
and responsibilities of a LAN central monitoring site.

A.   ASSUMPTIONS 

To productively discuss the topic of network management,
it is important that a common base of understanding
concerning the SPLICE (Stock Point Logistics Integrated
Communications Environment) local area computer network be
established. This section briefly describes the Network
Layer Protocol proposed for the SPLICE LAN. This discussion
will include error detection, packet acknowledgement,
collision detection, access control, bus control,
retransmission technique, and packet format. Additionally, the
network topology and physical transmission medium will be
identified. Finally, the proposed End to End Protocol will
be briefly described. A more detailed explanation of the
SPLICE concept can be found in [Ref. 1].




1.   Network Topology and Transmission Medium

The Ring, Star, Unstructured, and Global Bus topologies
were discussed in [Ref. 2]. The primary considerations in
selecting a topology were its flexibility, reliability, and
simplicity. Understanding that the structure of the network
must lend itself to change and reconfiguration, one author
[Ref. 2: p. 21] recommended that a global bus topology be
adopted for the SPLICE local computer network.

Although a number of transmission media were
discussed in [Ref. 2], no particular technology was
recommended for all SPLICE network configurations. For this
discussion of network management, it will be assumed that
the transmission medium is coaxial cable and that a baseband
technology is being utilized.

2.   Network Layer Protocol

Decentralized control of the bus is the premise upon
which all subsequent characteristics are based. Nodes
access the network utilizing a random access contention
mechanism with collision detection (CSMA/CD). Error
detection is accomplished through the use of a cyclic redundancy
checksum. A successfully received packet is acknowledged
either by sending a special acknowledgement packet or by
including the acknowledgement with a data packet bound for
the appropriate node. Upon detection of a collision, a node
implements an adaptive binary exponential backoff
retransmission technique. Finally, there exists a single
packet format for both data and control information, the
specific type being identified in the packet type field
[Ref. 2: p. 53].
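
The retransmission technique named above can be illustrated
with a minimal sketch of truncated binary exponential backoff.
It is written under stated assumptions and is not taken from
the SPLICE design: the slot time, the cap on the exponent, and
the names backoff_slots and backoff_delay are hypothetical, and
the adaptive refinements mentioned in the text are not modeled.

```python
import random

# Assumed values for illustration only; the SPLICE design in [Ref. 2]
# may use different constants.
SLOT_TIME_US = 51.2   # hypothetical slot time in microseconds
MAX_EXPONENT = 10     # hypothetical cap on the backoff exponent


def backoff_slots(collisions, rng=random):
    """Return a random wait, in slots, after the given number of collisions.

    After the n-th consecutive collision the node waits a uniformly chosen
    number of slots in the range 0 .. 2**min(n, MAX_EXPONENT) - 1.
    """
    if collisions < 1:
        raise ValueError("collisions must be at least 1")
    exponent = min(collisions, MAX_EXPONENT)
    return rng.randrange(2 ** exponent)


def backoff_delay(collisions, rng=random):
    """Return the wait in microseconds for the given collision count."""
    return backoff_slots(collisions, rng) * SLOT_TIME_US
```

Because the range of possible waits doubles with each
successive collision, nodes that collide repeatedly spread
their retries over a progressively longer interval.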

3.   End to End Protocol

TCP was utilized as a basis from which to develop
the transport protocol. Justification for the use of TCP
can be found in [Ref. 3]. A major consideration during the
design of an end to end protocol was the assumption that
SPLICE LANs would be connected to each other through the
Defense Data Network. The fact that the end to end protocol
currently planned for the DDN is TCP further accentuates the
benefits to be derived by having a TCP based transport
protocol. Investigation shows that if TCP is used in the
strictest sense, without any modification, as the local
transport control protocol, simple internetwork communication
will be achieved at the expense of suboptimal intranetwork
performance [Ref. 2: p. 73].

B.   LAN  ARCHITECTURE 

This section depicts and briefly describes the logical
and physical views of the SPLICE LAN. These diagrams are
included in order to provide a visual representation which
may be referred to during the discussion of network
management throughout the thesis.

1.   Local Network Logical View

The six boxes along the top of Figure 1.1 are
identified as operation functions implemented in software
modules. The three boxes to the right and in the middle
represent support functions implemented in software modules.
Control messages flow along the logical control bus while
data messages flow along the logical data bus. A more
detailed explanation of these functional modules can be
found in [Ref. 4].

2.   Local Network Physical View

There exists only one physical bus, upon which both
control and data messages flow. The functions identified
in the logical view of the network have been assigned
to specific minicomputers. As can be seen in Figure 1.2,
the network management function has not been identified.
Theoretically, this function could reside in one or all of
the network nodes. An in-depth discussion of this topic will
be undertaken in Chapter 2.

C.   NETWORK MANAGEMENT DISCIPLINES

If viewed as a single module, the network management
function appears quite complex. Different aspects of the
function appear to overlap, while others appear to be
disjoint and unrelated. In order to more effectively
analyze the various aspects of the network management
function, a disaggregation of the function into unique,
identifiable modules is undertaken. Freeman proposes six
distinct management disciplines associated with managing the
components of a computer network [Ref. 5: p. 91]. These
disciplines include problem determination, performance
analysis, problem management, change management,
configuration management, and operations management. The
purpose for presenting these disciplines is twofold: first,
to create more manageable and understandable modules through
which the concept of network management can be discussed,
and second, to provide a foundation upon which various
network management techniques can be analyzed throughout the
thesis. Each of these disciplines will be briefly described
in the following sections.

1.   Problem Determination

Problem determination is the process of identifying
a failing or down component of the network so that
corrective action may be taken. It includes awareness that a
problem exists, isolation of the problem to a particular
element, identification of what caused the problem, and
determination of the correct organization, individual, or
vendor responsible for correcting that specific type of
problem.

2.   Performance  Analysis 

Performance analysis deals with quantifiably
answering the question, "How well is the network doing
what it is supposed to do?" It provides for the
measurement of certain dependent variables throughout the
network. These measurements are then compared to criteria
that have been previously established by some other means
(e.g., by mathematical models). By observing the variance
between these figures, a snapshot of the network's
performance can be obtained for that particular instant in
time. A number of the variables measured can be classified
as "tuning" statistics. Certain parameters exist which can
be adjusted by network operations personnel in order to
affect the values of these tuning statistics. In this way,
we can affect both network performance and the quality of
the service provided by the network as perceived by the
user.
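
The comparison of measurements against previously established
criteria can be sketched as follows. This is an illustration
only: the statistic names, the target values, and the ten
percent tolerance are invented, and in practice the criteria
would come from a model or from operational experience.

```python
def deviations(measured, criteria):
    """Return measured minus expected for every statistic present in both."""
    return {name: measured[name] - criteria[name]
            for name in criteria if name in measured}


def out_of_tolerance(measured, criteria, tolerance=0.10):
    """List statistics whose relative deviation exceeds the tolerance."""
    flagged = []
    for name, delta in deviations(measured, criteria).items():
        expected = criteria[name]
        if expected != 0 and abs(delta) / abs(expected) > tolerance:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical criteria and one hypothetical snapshot of measurements.
criteria = {"mean_delay_ms": 20.0, "utilization": 0.40}
measured = {"mean_delay_ms": 31.0, "utilization": 0.42}
flagged = out_of_tolerance(measured, criteria)
```

Statistics that appear in the flagged list are candidates for
parameter adjustment by network operations personnel.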

3.   Problem Management

Problem management concerns the reporting, tracking,
and resolution of problems that affect a user's or process'
capability to communicate with any other user or process.
Establishment and maintenance of a problem database can be
accomplished in a number of ways. Problems may be
documented manually utilizing pencil and paper. They may be
recorded semi-automatically through manual entry into a
database. Or, problems may be recorded automatically
through the interaction of the problem management module
with the problem determination and performance analysis
modules. The method chosen for recording network problems
should provide for data consistency, real time or nearly
real time information, user accessibility, and minimal
operations personnel involvement. A list of possible
entries for inclusion in a problem record is provided in
Appendix A.

4.   Change Management

Changes made to a network component that are not
promulgated throughout, or made available to, the network
may lead to substantial delay when communicating with that
element or even make that element inaccessible. Change
management precludes these events from occurring by
reporting, tracking, obtaining approval for, and verifying
the implementation of changes in network components [Ref. 5:
p. 91]. Pencil and paper, or manual entry into a database,
are two methods by which change management may be
accomplished.

5.   Configuration  Management 

Configuration management provides for the creation
of a database which contains the past, present, and future
physical and logical characteristics of all network elements
[Ref. 5: p. 91]. Included in this would be the SPLICE
minicomputers, host computers, shared resources, the
subnetwork, and pertinent information concerning any
connected networks. The configuration management database
should be accessible by both software modules and users as
needed. Updating and maintenance of this database could be
accomplished in the same manner as for the problem
management database. It is this researcher's opinion that
configuration management could most efficiently be
accomplished utilizing automated techniques which are based
on the interaction of the various network management
modules. A list of entries that may be included in a
configuration management record of a network component is
included in Appendix B.

6.   Operations Management

Operations management supports the remote
manipulation of various network elements [Ref. 5: p. 91].
Some of the forms this manipulation takes include testing a
piece of hardware such as an adapter, testing specific
software such as a process which counts the number of times
an individual packet attempts to access the channel before
it is successful, adjusting parameters in order to affect
the values of certain dependent variables which characterize
the performance of the network, and starting up a remote
process within a node which acts as an artificial traffic
generator. Additionally, during the process of network
reconfiguration, this management function supports the
remote loading of software into the appropriate network
element.

II.  DESIGN ISSUES IN NETWORK MONITORING

Measurements allow us to gain valuable insight regarding
network usage and behavior [Ref. 6: p. 1439]. They provide
a means to evaluate the performance of the implemented
protocols. Additionally, they give the designer the ability
to detect network inefficiencies and identify design flaws.
On an operational level, measurement provides the statistics
upon which the network is tuned through adjustment of
appropriate parameters. In a global sense, measurement can be
seen as the foundation upon which network management is
based. Hamming expresses the importance of measurement in
the statement, "It is difficult to have a science without
measurement." This emphasis on an accurate measurement
capability assists in understanding why such elaborate and
complex measurement techniques have been devised for
experimental and operational networks.

Before any type of measurement is conducted of a network
or its associated components, two basic questions must be
answered. They are, "What is to be measured?" and "Why
should the measurement be taken?" These questions will be
addressed in Chapters 3 and 4 respectively. At this time,
an explanation of basic monitoring methodologies will be
undertaken, followed by a discussion of current monitoring
technologies.

A.   NETWORK  MONITORING  METHODOLOGIES 

Currently, there exist three basic methodologies
utilized as the foundations for the creation of various
network monitoring technologies. These three methods are
hardware monitoring, software monitoring, and hybrid
monitoring. These methods will be discussed for the purpose
of establishing a basis upon which the monitoring
technologies may be analyzed.

1.   Hardware Methodology

A pure hardware monitor is a unit that is both
physically and logically distinct from the network component
being measured [Ref. 7: p. 57].

The interface between the monitor and the component is a
physical probe used for the collection and passing of
electronic signals from the component to the monitoring device.
Figure 2.1 depicts a generalized hardware monitoring device
[Ref. 7: p. 57].


[Figure 2.1 block diagram not reproduced. Labeled blocks:
Network Component; Signal Filter and Combination Logic Unit;
Time and Count Unit; Online Analysis Unit.]

Figure 2.1   Hardware Monitoring Device: Logical View.

The critical item needed for a hardware monitor is
an electronic signal that indicates the occurrence of an
event [Ref. 7: p. 57]. Since many signals to be monitored
are of a relatively low voltage, one must consider that the
introduction of a monitoring device may disturb the normal
operation of the circuit being monitored. To preclude this,
a high impedance probe can be utilized. The signal observed
by the probe is amplified and passed to a signal filter and
combination logic unit. The task of the signal filter and
combination logic unit is to mask and combine signals
received from various probes. This output is then sent to a
time and count unit. Here, the duration of a specific
signal, or the number of times a certain signal occurs, can
be recorded. Having collected the required data appropriate
for the test being conducted, the contents of the time and
count unit can be directed to a mass storage device for
off-line analysis or directly to a user for on-line analysis.
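
The role of the time and count unit can be illustrated in
software. The sketch below is an analogy only, since the unit
described here is hardware: it assumes the combined signal has
already been reduced to timestamped binary samples, and the
function name time_and_count and the sample format are
hypothetical.

```python
def time_and_count(samples):
    """Summarize a sampled binary signal.

    samples is a list of (timestamp, level) pairs in increasing time order,
    where level is 0 or 1. Returns (count, duration): the number of 0-to-1
    transitions and the total time the signal was observed at level 1.
    """
    count = 0
    duration = 0.0
    prev_time, prev_level = None, 0
    for time, level in samples:
        # Accumulate the time spent at level 1 since the previous sample.
        if prev_time is not None and prev_level == 1:
            duration += time - prev_time
        # A rising edge marks one occurrence of the signal.
        if level == 1 and prev_level == 0:
            count += 1
        prev_time, prev_level = time, level
    return count, duration


# Hypothetical trace: the signal is high from t=1 to t=3 and t=4 to t=4.5.
trace = [(0.0, 0), (1.0, 1), (3.0, 0), (4.0, 1), (4.5, 0)]
result = time_and_count(trace)
```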

The main advantage of a hardware monitor is its
ability to sense a wide range of hardware and software
events. In addition to cost, the main disadvantage of a
hardware monitoring device is its limited ability to detect
the stimulus for the set of signals it is monitoring.

2.   Software Methodology

Although various definitions exist, a software
monitor can be viewed as a process which resides in the
component being monitored. Two types of software monitors
exist which are appropriate for the task of monitoring a
computer network. They are the interrupt-intercept
methodology and the sampling methodology [Ref. 7: p. 56].

The interrupt-intercept methodology embraces the
idea of carrying out some type of monitoring activity every
time the state of the particular resource in which the
monitor is resident changes. The monitoring routine is
invoked whenever an interrupt is generated. The scheme
calls for intercepting each interrupt as it occurs,
directing it to a monitoring routine where the interrupt is
analyzed and appropriate monitoring functions activated, and
finally, passing the interrupt to its intended destination.
This monitoring methodology has the distinct advantage of
allowing measurements to be taken as an integral part of the
system rather than as a lower level application program.
Substantial amounts of processing time and memory are
required for this method. Additionally, it requires that
the software monitoring program run at a very high priority
to prevent other interrupts from deactivating the monitor
[Ref. 7: p. 57].
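
The intercept, record, and forward sequence described above can
be sketched at the application level. This is a loose analogy
under stated assumptions: ordinary function calls stand in for
hardware interrupts, and the class InterruptInterceptMonitor
and its method names are hypothetical.

```python
from collections import Counter


class InterruptInterceptMonitor:
    """Wrap interrupt handlers so each interrupt is recorded before delivery."""

    def __init__(self):
        self.counts = Counter()

    def intercept(self, kind, handler):
        """Return a handler that records the interrupt, then forwards it."""
        def wrapped(*args, **kwargs):
            self.counts[kind] += 1            # monitoring function
            return handler(*args, **kwargs)   # pass on to the destination
        return wrapped


# Hypothetical use: count packet arrivals without changing their handling.
monitor = InterruptInterceptMonitor()
received = []
on_packet = monitor.intercept("packet_received", received.append)
on_packet("p1")
on_packet("p2")
```

Every wrapped call pays for the bookkeeping before the original
handler runs, which mirrors the processing overhead noted for
this methodology.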

The sampling methodology treats the software
monitoring program as a normal user program for a
multiprogramming system. The activation of the monitoring
program may be accomplished by the component resident
operating system, by another monitoring application program, or
by network operations personnel. This activation may occur
at random intervals, scheduled intervals, or a combination
thereof. The selection of inter-sample periods is critical
in that it must not be synchronized with the occurrence of
events which are being measured by the monitor [Ref. 7: p.
57]. As with the interrupt-intercept methodology, a
significant amount of processor time and memory space may be
required.
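
One way to keep inter-sample periods from synchronizing with
periodic events is to randomize them. The sketch below is one
possible scheme, not a method taken from [Ref. 7]: the function
name sample_times and the choice of gaps between 0.5 and 1.5
times the mean interval are assumptions made for illustration.

```python
import random


def sample_times(start, end, mean_interval, rng=random):
    """Return sampling instants in (start, end) with randomized spacing.

    Each gap is drawn uniformly from 0.5 to 1.5 times mean_interval, which
    reduces the chance that sampling stays in step with events that recur
    at a fixed period.
    """
    if mean_interval <= 0:
        raise ValueError("mean_interval must be positive")
    times = []
    t = start + rng.uniform(0.5, 1.5) * mean_interval
    while t < end:
        times.append(t)
        t += rng.uniform(0.5, 1.5) * mean_interval
    return times
```

A fixed schedule corresponds to replacing the random factor
with 1.0; mixing the two gives the combination of random and
scheduled intervals mentioned in the text.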

The principal advantage of the software monitors
presented above is their ability to associate occurrences of
measured events with their causes. The primary disadvantage
is their requirement for substantial resource utilization.
The strengths of the hardware and software monitoring
methodologies have been combined and their weaknesses
eliminated through the use of a hybrid approach.

3.   Hybrid Methodology

In contrast to the hardware monitoring methodology,
the hybrid approach to monitoring does not view the hardware
monitoring device as being invisible to the network
component. The hybrid methodology utilizes a microcomputer to
control the functions of the hardware monitoring device in
response to data gathered by hardware probes. Figure 2.2
represents the logical view of a hybrid monitoring device.
The data channel provides a means by which the software
monitor resident in the device being monitored can
communicate with the hardware monitoring device. Along this
channel can pass interrupts and messages concerning the
occurrence of software events within the component. These can
then be associated with signals sensed by the probes of the
hardware monitoring device. This overcomes the strict
hardware monitoring methodology's inability to associate a
signal with a specific event occurrence within the network
component. Additionally, the problem of component resource
utilization associated with the strict software monitoring
methodology is overcome by the transfer of various
monitoring functions from the network component to the hardware
monitoring device.

Technologies for the location of monitoring
capabilities within a computer network implicitly utilize one of
the methodologies, or a variation thereof, discussed above.
A number of these technologies will be discussed in the
following section.

B.   NETWORK MONITORING TECHNOLOGIES

There are certain considerations that should be
addressed when selecting a monitoring technology.
Initially, a decision has to be made on whether to record
every occurrence of a certain event, sometimes called trace
monitoring [Ref. 7: p. 53], or to collect samples from the
network at selected intervals of time. Timing considerations
for the NBSNET measurement system indicate complete
measurement is possible [Ref. 8: p. 725]. Its architecture,
being similar to that of the SPLICE LAN, would seem to
indicate that complete measurement would be possible for the
SPLICE LAN. Whether this would be desirable or practical are
questions that remain to be addressed. The technique
selected must be able to monitor both hardware and software
components individually and in any combination. The level
of monitoring to be conducted must be determined. Does the
technology under consideration provide the capability of
both a macroscopic and microscopic level of monitoring? Is
the monitoring technique capable of supporting a real-time
analysis requirement? To what degree does the monitoring
technique introduce artifact into the system? Other items
to be considered include clock resolution and clock
synchronization.

Figure 2.2   Hybrid Monitoring Device: Logical View.

1.   Sidestream Monitoring

The sidestream monitoring technology [Ref. 5: p.
92] requires that probes be attached to the side of network
components. By attaching these probes to the "side" of
network components, we mean physically placing them such
that they may sample and analyze data from physical
interfaces within the component, and at the interface between the
component and network bus. These probes extract and analyze
data from physical interfaces established with these
elements. Additionally, the sidestream technique obtains
information about the network interface and the subnetwork
through the use of a measurement module resident in the
adapter. Information gathered by these probes and modules
may be sent to a network monitoring center, or to a set of
management programs, via a secondary channel which is
frequency-division multiplexed onto the same circuit being
used by the primary data channel.

A major advantage of the sidestream technique is
its ability to alert network operations personnel to
certain types of problems without interfering with normal
data traffic. Certain tests may also be undertaken which
utilize this secondary channel. In this way, isolation
testing may take place without disrupting the primary data
channel. Even though a secondary channel assists in
isolating and correcting certain problems, there still
remain certain tests that must utilize the primary data
channel for their accomplishment. This has been found to be
one of the major problems of the sidestream technique
because, during the conduct of these tests, the network is
unavailable.

The sidestream monitoring technology presented here
is a subset of a more encompassing network management
philosophy. Our discussion has briefly touched on the topic
of component failure identification. This was determined to
be necessary in order to more clearly define and explain the
advantages and disadvantages of this technology. This
subject will be addressed again when a discussion of various
techniques for identifying, isolating, and correcting
failing network components is undertaken in Chapter 4.

2.   Mainstream Monitoring

The mainstream monitoring technique operates through the use of
hardware and software implemented among existing network
components. These additions provide data to a network monitoring
center or a set of network management programs. Notification of
problems existing within the network is accomplished through the
generation of asynchronous problem messages. These messages are
communicated as normal data traffic on the primary data channel.
Data provided by these asynchronous problem messages is usually
sufficient to isolate a problem to a particular component without
further problem isolation tests such as those required by the
sidestream method. Error records within a problem message contain
specific information concerning the problem being reported.
Information contained in the error records is generated by
testing modules resident in the network components, which are
invoked upon problem recognition. If information contained within
the problem record is insufficient to isolate the cause of a
specific problem, additional isolation testing is initiated.

The major advantage of the mainstream monitoring technique is its
ability to isolate and diagnose a problem based upon information
contained in the problem message. A problem with this technique
revolves around the requirement for these problem messages to
utilize the primary data channel for transmission to the central
monitoring site. If an adaptor through which the message must
pass is down, or if the subnetwork is congested, the problem
message may experience some delay before being communicated to
the central monitoring site.

Like the sidestream technique, the mainstream technology
presented here is a subset of a more encompassing network
management philosophy. Discussion of problem identification and
isolation was included for clarification purposes. Additional
discussion on the subject of component failure identification,
isolation, and correction will be undertaken in Chapter 4.

3.   Centralized Monitoring

A broadcast network lends itself naturally to a centralized
measurement approach [Ref. 8: p. 725]. Centralized monitoring
requires modification of the adaptor connecting the processor
which houses the network management function to the bus. Through
this modification, the adaptor can monitor all packets on the
network. Some of the information which can be extracted and
determined from monitoring packets transiting the network
includes: packet size, number of packets of each type
transmitted, and interarrival time since the last packet. Since
the modified adaptor simply makes a copy of the passing packet,
extracts the required information from it, and discards the copy,
no artifact is introduced into the system.

Certain important information cannot be obtained utilizing the
centralized monitoring technique. The time between arrival of a
packet at the network interface and its subsequent transmission
onto the network is only available at the interface. Thus we have
no measurement of the effectiveness of our access protocol.
Although a collision on the network can be detected by the
central monitor, it is not capable of determining which nodes'
packets were involved in the collision.

Additionally, timing information recorded by the central monitor
is biased. This is caused by the propagation delay between the
sending adaptor and the monitoring adaptor. Figure 2.3 depicts a
local network with centralized monitoring.
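The bookkeeping performed by the modified adaptor can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation described in [Ref. 8]; the `CentralMonitor` class, its `observe` method, and the packet type labels are hypothetical names chosen for the example.

```python
from collections import Counter

class CentralMonitor:
    """Passive tally of packets seen on the bus (hypothetical sketch)."""

    def __init__(self):
        self.type_counts = Counter()   # packets seen, per packet type
        self.sizes = []                # size in bytes of each packet
        self.interarrivals = []        # time since the previous packet
        self._last_seen = None

    def observe(self, arrival_time, packet_type, size_bytes):
        # Only derived values are kept; the copy of the packet is
        # discarded, so nothing is transmitted onto the network.
        self.type_counts[packet_type] += 1
        self.sizes.append(size_bytes)
        if self._last_seen is not None:
            self.interarrivals.append(arrival_time - self._last_seen)
        self._last_seen = arrival_time

monitor = CentralMonitor()
for when, kind, size in [(0.0, "data", 512), (2.5, "ack", 64), (3.0, "data", 256)]:
    monitor.observe(when, kind, size)
```

The arrival times recorded here are those seen at the monitoring adaptor, so they carry the propagation bias noted above.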


Figure 2.3   Centralized Monitoring.
[Diagram not reproduced. Recoverable labels: minicomputers and
host computers, each connected to the network through an adapter;
the network management function, connected through a modified
interface.]

4.   Decentralized Monitoring

Figure 2.4 represents a decentralized monitoring scheme. In using
this approach, the burden of network monitoring is placed on each
individual interface. The functions of the central monitoring
site no longer include monitoring. The tasks performed by the
central monitoring site are now restricted to data collection
from the adaptors, data reduction, and data analysis.

Figure 2.4   Decentralized Monitoring.

Measurement information is obtained by the central monitoring
site through the receipt of information packets generated by the
individual adaptors. Transmission of measurement information may
be as frequent as with every packet. Other protocols call for the
transmission of measured information after a certain amount of
time has elapsed, or after a certain number of events have
occurred.
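The three reporting disciplines just described (every packet, elapsed time, event count) can be sketched as a single policy object. This is an illustrative sketch only; the `ReportPolicy` class and its parameters are hypothetical and are not taken from any of the referenced systems.

```python
class ReportPolicy:
    """Decides when an adaptor sends its measurements (hypothetical sketch)."""

    def __init__(self, max_events=None, max_seconds=None):
        self.max_events = max_events      # report after this many events
        self.max_seconds = max_seconds    # or after this much elapsed time
        self.events = 0
        self.last_report = 0.0

    def record_event(self, now):
        """Count one measured event; return True if a report is due."""
        self.events += 1
        due_by_count = (self.max_events is not None
                        and self.events >= self.max_events)
        due_by_time = (self.max_seconds is not None
                       and now - self.last_report >= self.max_seconds)
        if due_by_count or due_by_time:
            self.events = 0
            self.last_report = now
            return True
        return False

every_packet = ReportPolicy(max_events=1)          # report with every packet
every_three = ReportPolicy(max_events=3)           # report after three events
every_ten_seconds = ReportPolicy(max_seconds=10.0) # report after elapsed time
```

In this sketch the elapsed-time rule is only checked when an event occurs; an adaptor with a real-time clock could equally fire the report from a timer.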

With a decentralized approach all information about the network
traffic is available [Ref. 8: p. 725]. Information about
collision induced delays and collision counts can be obtained
from each adaptor. Exact times for packet transmission and
receipt are available. Another positive attribute of
decentralized monitoring is the ability to identify those nodes
whose packets were involved in a collision. To provide this
enhanced service, additional memory and real time clocks must be
incorporated into each network interface. Additionally, the
periodic transmission of data to the central monitoring site
requires overhead communication. If sent over dedicated lines, as
depicted in Figure 2.4, extra costs are incurred. If these
information packets are sent over the primary data channel,
artifact is introduced into the system. Finally, since this
technique requires that all adaptors in the network possess a
greater than normal degree of intelligence, implementation and
maintenance tend to be more costly than with centralized
monitoring.

5.   Hybrid Monitoring

The hybrid monitoring technique grew out of the advantages and
disadvantages of the centralized and decentralized technologies.
In this approach, as much information as possible is collected by
the central monitoring site. Only those measurements unobtainable
by the central monitor are measured by each network interface.
This allows for minimal modification to the network interface.
Figure 2.5 represents the hybrid monitoring technique.

The transmission of data to the central monitoring site is
initiated upon the termination of a logical connection.
Implementation of this protocol reduces the introduction of
artifact into the system.

Figure 2.5   Hybrid Monitoring.

In combining the advantages and eliminating the disadvantages of
centralized and decentralized monitoring, the hybrid monitoring
technology has provided the network with an accurate and
comprehensive measurement and monitoring capability. One
disadvantage deals with the complexity of coordinating the
analysis of decentralized and centralized measurement
[Ref. 8: p. 725].

C.   CHAPTER SUMMARY

This  Chapter  began  with  an  explanation  of  hardware, 
software,  and  hybrid  monitoring  methodologies.  Strengths 
and  weaknesses  of  each  were  discussed.  Those  attributes, 
both  positive  and  negative,  associated  with  software  and 
hardware  monitoring  were  found  to  be  the  criteria  upon  which 
the  development  of  the  hybrid  approach  was  based. 

In the second section of this Chapter, implementations of the
basic methodologies were presented. The monitoring technologies
addressed were: sidestream, mainstream, centralized,
decentralized, and hybrid monitoring. The discussion of each
technology included: a brief explanation of the operation of the
monitoring technique, presentation of advantages and
disadvantages, and in some cases, comparison to other monitoring
technologies.

Each one of the monitoring technologies presented is capable of
providing adequate monitoring and measurement capabilities for
use by the SPLICE LAN management function. It is proposed that
the hybrid technique be adopted as the monitoring technology
utilized by the SPLICE LAN. This technique emphasizes the concept
of minimizing data collection at network interfaces. Only those
measurements unobtainable by the central monitor would be
gathered by the adaptors. As in the mainstream monitoring
technology, each adaptor would be capable of problem detection
and of invoking local test modules which would gather data
concerning the problem for subsequent transmission to the central
monitoring site. Data collected by the adaptors would be sent to
the central monitoring site as administrative packets over the
primary data channel. In addition to the transmission of routine
measurement information upon the termination of each logical
connection, problem messages, similar to those implemented by the
mainstream technology, will be transmitted asynchronously to the
central monitoring site.

III.  NETWORK MEASUREMENT TOOLS, AND MEASUREMENTS AND STATISTICS

To this point, the general architecture of a SPLICE LAN, along
with a proposed monitoring methodology, has been presented. Now
that a method exists which allows us to obtain information from
the network, the focus of this thesis changes to address the
question: What measurements and statistics should we be able to
derive from the network in support of experimental and
operational functioning? An attempt will not be made to itemize
the measurements and statistics required for the accomplishment
of each specific experimental or operational endeavor. Rather, a
discussion of basic measurement tools will be undertaken,
followed by the identification and explanation of measures and
statistics appropriate for use in managing local area networks
which must interface with the DDN and where control of the
dominant DDN does not come under the authority of the LAN
managers.

A.   NETWORK MEASUREMENT TOOLS

In order to evaluate the performance of a network, and to
identify down or failing components, several measurement tools
must be available. These tools are: cumulative statistics, trace
statistics, snapshot statistics, artificial traffic generators,
emulation, a network measurement center which includes control,
collection and analysis of data, and a network control center
which accomplishes status reporting, monitoring, and controlling
the network. These latter two tools may be combined into a single
entity which is sometimes called a monitoring center. Each of
these tools will be addressed in the following section.

1.   Cumulative Statistics

Cumulative statistics consist of data regarding a variety of
events, accumulated over a given period of time. These are
provided in the form of sums, frequencies, and histograms
[Ref. 6: p. 1439]. This tool is one that should be included in
the capabilities of the SPLICE local area computer network
measurement facility. Since some cumulative statistics can become
quite long, it is wise to control their transmission to the
central site in some way. One approach might be to designate
certain items within a cumulative statistical message as being
optional. This provides network operations personnel with many
measurement capabilities, yet precludes the formulation and
transmission of excessively long cumulative statistical messages.

2.   Trace Statistics

Trace statistics allow network operations personnel to literally
follow a packet through the network and to learn of the route
that it takes and the delays it encounters [Ref. 10: p. 633].
Obviously, in a bus oriented network, there does not exist a
requirement to identify the route a packet has taken to its
destination. Although more applicable to a packet switched store
and forward network, certain aspects of a trace mechanism may
prove useful in a local area network. One such aspect might be
timestamping the packet as it arrives at the adaptor from a
processor, and subsequently recording the time at which the
packet is successfully transmitted. Additionally, the packet
could be timestamped when it arrives at the destination adaptor,
and the time at which the packet is forwarded to the resident
processor subsequently recorded. These statistics can then be
forwarded to a central monitoring site upon demand or at some
predetermined time.

3.   Snapshot Statistics

Snapshot statistics provide an instantaneous look at a device,
showing its state with regard to various queue lengths and buffer
allocation [Ref. 6: p. 1440]. In a high speed, dynamic
environment such as a local area network, these types of
statistics can prove valuable in the evaluation of certain
protocols. Evaluation of a network access protocol could be
conducted by observing the length of the 'packets ready for
transmission' queue. Additional information that could be
contained in a snapshot of a particular network component or set
of components includes processor queue lengths, storage
allocation, and status of adaptor buffers for receipt and
transmission of packets.

4.   Artificial Traffic Generators

The use of artificial traffic generators provides network
operators with the ability to create streams of packets with
specified durations, inter-packet gaps, packet lengths and other
appropriate characteristics [Ref. 6: p. 1440]. This tool plays a
major role during the implementation of the LAN. In the absence
of sufficient traffic to test certain aspects of the network,
artificial traffic generators provide the mechanism through which
varying network load conditions can be simulated. This provides a
more realistic environment in which testing may be conducted, and
provides a mechanism that can be used to identify network problem
areas. By doing this, we are able to effect modifications to the
LAN while it is in its infancy rather than attempting to make
changes when production activities are heavy and corrections more
expensive. Additionally, artificial traffic generators may be
used to test and analyze various network protocols. This is
accomplished through the generation of identical transmission
strings, thereby providing a basis upon which the performance of
the various protocols can be compared.

Amer [Ref. 8: p. 726] has identified five capabilities that
should be possessed by an artificial traffic generator. These
include the ability to: 1) generate packets with a constant,
uniform, or Poisson size distribution, 2) generate packets with
constant, uniform, or exponential interarrival times, 3) direct
packets to any specified destination, 4) communicate with the
monitoring system to synchronize traffic generation and data
collection, and 5) permit on-line operations personnel control.
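The first three capabilities can be sketched in a few lines. The following Python fragment is a minimal illustration under stated assumptions, not a description of the generator in [Ref. 8]: the function names `poisson` and `generate`, the host labels, and the parameter values are all hypothetical. Capabilities 4 and 5 (synchronization with the monitor and operator control) are not modeled.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson-distributed integer (Knuth's multiplication method).

    Suitable for modest means only; exp(-mean) underflows for very
    large values.
    """
    limit, k, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def generate(rng, count, size_dist, gap_dist, destination):
    """Yield (send_time, destination, size_bytes) for `count` packets."""
    now = 0.0
    for _ in range(count):
        now += gap_dist(rng)
        # A drawn size of zero is raised to one byte so every packet exists.
        yield now, destination, max(1, size_dist(rng))

rng = random.Random(1)   # fixed seed so a test run can be repeated
constant_stream = list(generate(rng, 3, lambda r: 128, lambda r: 0.5, "host-B"))
random_stream = list(generate(rng, 100,
                              lambda r: poisson(r, 64),           # Poisson sizes
                              lambda r: r.expovariate(1 / 0.01),  # mean gap 0.01 s
                              "host-C"))
```

Uniform sizes or gaps would be obtained the same way by passing, for example, `lambda r: r.randint(64, 512)` or `lambda r: r.uniform(0.0, 0.02)`. Reusing the same seed reproduces an identical transmission string, which is what makes protocol comparisons possible.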

5.   Emulation

Emulation is the creation of an illusion that there exist more
components of a certain kind in the network than actually exist.
Each one of these "fake" components is capable of displaying the
characteristics of, and performing the functions of, a "real"
physical component of that type. Emulation is required when there
are not enough network components to provide sufficient traffic
generation and a range of nodal characteristics. In supplementing
these areas, emulation gives the operator a better understanding
of network behavior under various configurations. Closely related
to this is the use of emulation in conjunction with capacity
planning. Through emulation, we are able to determine what effect
a change to the network configuration will have on various
performance measures. A situation in which emulation might be
employed would be to determine the effect of adding a host
processor to, or deleting one from, the SPLICE local area
computer network.

6.   Network Measurement/Control Center

In the early, experimental days of the ARPA Computer Network,
there existed physically separate measurement and control
centers. This allowed for continual experimentation with the
network from the measurement center, while actual control of the
network was conducted from the control center. These two
functions of measuring and controlling have been combined, and
will be undertaken by a single monitoring center for the Defense
Data Network [Ref. 11: pp. 95-102]. The responsibilities of a
Network Measurement/Control Center include: controlling the
measurement facilities, collecting and analyzing data, generating
status reports, and monitoring and controlling the network. These
responsibilities, as they apply to a local computer network, will
be addressed in much greater detail in Chapter 5.

B.   MEASUREMENTS  AND  STATISTICS 

Now that the measurement tools have been identified and
discussed, the question arises, 'How do we select and implement
the appropriate tools for the measurement task at hand?' Before
this subject can be addressed, the answers to two questions posed
in Chapter 2 must be determined. Those questions were: Why should
the measurement be taken? (i.e., what managerial and research
questions are to be answered by the measurement?), and, What is
to be measured? (i.e., what specific network characteristics must
be measured in order to satisfy these questions?).

An approach to answering these two questions is as follows.
Initially, it is appropriate that the object of the measurement
operation be defined. This entails the identification of some
specific area of the network to be investigated. In conjunction
with this, the goals of the measurement operation are solidified.
The next step is to select those performance measures that best
characterize the area of the network being studied. Finally, the
specific measurement tools most appropriately suited for the
measurement operation are identified and selected for
implementation.

Goals of measurement operations are usually motivated by a desire
for software verification, for performance evaluation and
verification, to obtain feedback for system design iterations, to
identify down or failing components, and to study user behavior
and characteristics. Performance measures can be categorized as
basic, special, or composite. Examples of basic measurements
include throughput and delay. When examination of a specific
procedure is required, specialized measures must be used to
complement the basic measures. They are aimed at measuring a
specific attribute of a specific network component. Finally, in
order to analyze some global system properties which cannot be
easily described by throughput and delay, it becomes necessary to
aggregate a set of measurements that have been taken over a
specific monitoring period. This aggregation of measures is
called composite measurement. Examples of composite measures
include fairness, congestion protection, stability, robustness of
network algorithms to line errors, and reliability of a network
configuration with respect to component failures
[Ref. 6: p. 1443].

Returning to the subject of measurement tool selection and
implementation, we find that a subset of these tools has been
utilized in obtaining specific data from an operational local
area computer network [Ref. 8]. In describing certain reports
generated from data collected throughout this network, the
researcher will attempt to show how the tools are selected and
integrated in order to provide network operations personnel with
accurate, timely and sufficient information upon which the
management of the network may be based.

It would be infeasible to identify and provide rationale for
every measurement that could be taken from a local area network.
Additionally, this list would be a dynamic one, dependent upon
the goals and objectives of the specific network analysis
operation to be undertaken. For these reasons, an established
measurement capability for a local area computer network will be
investigated in an attempt to bring forth and discuss the
implementation of various measurement tools as they pertain to
the SPLICE LAN. Ten performance reports have been implemented for
measuring NBSNET traffic [Ref. 8]. Each report is classified as
either traffic characterization or performance analysis type.
Traffic characterization reports indicate the workload placed on
the system. Performance characterization reports indicate the
time delays, utilizations, etc., which result from a given load
and network configuration. They describe the dependent variables
that are observed rather than controlled, and are used for tuning
the network and making performance comparisons [Ref. 8: p. 726].
At this time, a brief description of each report will be given,
along with appropriate comments relating to requirements of, and
recommendations for, the SPLICE LAN.

1.   Host Communication Matrix

The Host Communication Matrix indicates the traffic flow between
connected nodes. For each node, data tabulated includes: the
total number of packets, data packets and data bytes received
from and sent to all other nodes. From this, the proportion of
data packets to total packets, and data bytes to total bytes are
determined. Utilizing the monitoring technique proposed for the
SPLICE LAN in Chapter 2, this information could be obtained by
the central monitoring center through its tap into the channel.
In this way the number of bytes in a packet can be counted by the
monitoring center, and the source, destination, and packet type
determined from the header. Additionally, a summary of the total
network traffic is made available which includes total packets,
data packets, and data bytes transmitted, and the mean number of
data bytes per packet.
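The tabulation described above can be sketched as a dictionary keyed by source and destination. This is an illustrative Python sketch only; the record layout, the `host_matrix` function, and the packet type label "data" are assumptions made for the example and are not taken from the NBSNET reports.

```python
from collections import defaultdict

def host_matrix(packets):
    """Tabulate traffic per (source, destination) pair.

    Each packet is (source, destination, packet_type, total_bytes,
    data_bytes), as a central monitor might read from the header.
    """
    cells = defaultdict(lambda: {"packets": 0, "data_packets": 0,
                                 "bytes": 0, "data_bytes": 0})
    for src, dst, kind, total_bytes, data_bytes in packets:
        cell = cells[(src, dst)]
        cell["packets"] += 1
        cell["bytes"] += total_bytes
        if kind == "data":
            cell["data_packets"] += 1
            cell["data_bytes"] += data_bytes
    return cells

observed = [("A", "B", "data", 532, 512),
            ("B", "A", "ack", 20, 0),
            ("A", "B", "data", 276, 256)]
matrix = host_matrix(observed)
a_to_b = matrix[("A", "B")]
data_packet_share = a_to_b["data_packets"] / a_to_b["packets"]
data_byte_share = a_to_b["data_bytes"] / a_to_b["bytes"]
# Network-wide summary: mean number of data bytes per packet.
mean_data_bytes = (sum(c["data_bytes"] for c in matrix.values())
                   / sum(c["packets"] for c in matrix.values()))
```

A group matrix follows from the same structure by mapping each node to its user defined group before the tabulation.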

2.   Group Communication Matrix

The Group Communication Matrix indicates the traffic flow between
any user defined groupings of nodes. The same type of information
tabulated by the Host Communication Matrix is recorded by the
Group Communication Matrix. The possible extension of this
concept to include the recording of the traffic flow between
various user designated processes may prove more valuable than
the information originally seen as the product of this report. An
example of this would be the recognition that a number of
processes utilize the same data file on a regular and possibly
concurrent basis. Assuming this data is currently kept on a tape
storage device (which is not out of the question in a government
installation), and possessing the information provided by the
Group/Process Communication Matrix, serious consideration should
be given to relocating this data to a faster and more accessible
storage device.

3.   Packet Type Histogram

The Packet Type Histogram records and summarizes the distribution
of each type of packet transmitted on the network. A simple
example would be the total number of data packets transiting the
network during a specified monitoring period. Gathering data to
be utilized in constructing a packet type histogram can be easily
accomplished by a central monitoring site. A summary of packet
types could provide network operations personnel with information
concerning the amount of 'overhead' data in relation to the
amount of 'pure' data being transmitted. Additionally, it may be
found that there exist certain times when the network is carrying
a disproportionate amount of overhead data as a result of
component failure, excessive measurement, or excessive monitoring
requirements.
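The comparison of 'overhead' to 'pure' data can be sketched as a ratio computed from the histogram counts. This is a hypothetical illustration; the `overhead_fraction` function and the decision about which packet types count as overhead are assumptions, and the counts shown are invented for the example.

```python
from collections import Counter

def overhead_fraction(type_counts, overhead_types):
    """Fraction of observed packets whose type is classed as overhead."""
    total = sum(type_counts.values())
    if total == 0:
        return 0.0
    overhead = sum(n for kind, n in type_counts.items()
                   if kind in overhead_types)
    return overhead / total

# Packet counts for one monitoring period (illustrative values only).
period = Counter({"data": 750, "ack": 200, "measurement": 50})
fraction = overhead_fraction(period, {"ack", "measurement"})
```

Computing this fraction for successive monitoring periods would expose the times at which overhead traffic becomes disproportionate.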

4.   Data  Packet  Size  Histogram 

This histogram records the number and proportion of data packets
that fall into a class of specified length. These classes can be
either preset or operator defined. For packets of fixed size, the
data portion alone may be counted and utilized as the criterion
for class inclusion. Variable size packets allow for a strict
count of bytes making up the entire packet. The use of a Data
Packet Size Histogram can be extremely useful in a network
utilizing packets of a fixed length. If the mean length of data
carried in any one packet is substantially below the carrying
capacity of the fixed data field, consideration should be given
to reducing the size of the fixed data field. This will reduce
the amount of 'excess baggage' being carried by packets
throughout the network. Likewise, if packet data fields are full
a good portion of the time, or nearly so, consideration should be
given to increasing the size of the data field.
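The sizing decision described above can be sketched as follows. This is a minimal illustration; the function names, the class width, and the 512-byte field capacity are hypothetical values chosen for the example.

```python
def size_histogram(data_lengths, class_width):
    """Count data lengths per class: class k covers [k*width, (k+1)*width)."""
    counts = {}
    for length in data_lengths:
        k = length // class_width
        counts[k] = counts.get(k, 0) + 1
    return counts

def mean_fill(data_lengths, field_capacity):
    """Average fraction of the fixed data field that actually carries data."""
    return sum(data_lengths) / (len(data_lengths) * field_capacity)

lengths = [40, 60, 100, 120, 512]          # data bytes carried per packet
histogram = size_histogram(lengths, class_width=128)
fill = mean_fill(lengths, field_capacity=512)
```

A fill value well below one suggests the fixed data field is oversized for the observed traffic; a value near one suggests it may be too small.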

5.   Throughput-Utilization Distribution

The Throughput-Utilization Distribution indicates the flow of
bytes on the network. Both information (data) bytes and total
bytes are measured. Information bytes do not include header
bytes, or unacknowledged data packets. Additionally, bytes
involved in collisions are not counted. Using this approach,
total channel throughput, channel utilization, information
throughput and information utilization can be determined for the
network. In this way, a true picture of the beneficial usage of
the network can be obtained. Collecting the measurements required
for the creation of this report is a simple task which can be
performed by the central monitoring site.
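The four quantities can be sketched from two byte counts. The definitions used here (throughput as bits per second over the monitoring period, utilization as throughput divided by channel capacity) are the conventional ones and are an assumption of this example rather than something stated in [Ref. 8]; the function name and the figures are illustrative.

```python
def throughput_report(total_bytes, info_bytes, period_seconds,
                      channel_bits_per_second):
    """Return channel and information throughput (bits/s) and utilization."""
    channel_throughput = total_bytes * 8 / period_seconds
    info_throughput = info_bytes * 8 / period_seconds
    return {
        "channel_throughput": channel_throughput,
        "channel_utilization": channel_throughput / channel_bits_per_second,
        "information_throughput": info_throughput,
        "information_utilization": info_throughput / channel_bits_per_second,
    }

# Ten-second period on an assumed 10 Mbit/s channel (illustrative values).
report = throughput_report(total_bytes=1_250_000, info_bytes=1_000_000,
                           period_seconds=10.0,
                           channel_bits_per_second=10_000_000)
```

The gap between channel utilization and information utilization is the share of capacity spent on headers and other non-information bytes.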

6.   Packet Interarrival Time Histogram

The Packet Interarrival Time Histogram indicates the number of
packet interarrival times which fall into particular time
classes. An interarrival time is the time between consecutive
carrier (network busy) signals. This measurement can assist in
determining how much the network is being used and what
percentage of the time the network is idle during a specified
monitoring period. If a large percentage of interarrival times
fall into a class which records occurrences of large interarrival
times, then it is safe to conclude that, during the monitoring
period in question, the network was not highly utilized. When
taking these measurements from a central monitoring site,
consideration should be given to the fact that the recorded
interarrival times will be slightly biased due to the propagation
delays between the adaptors and the monitoring site. In the high
speed environment of a local area network, these delays are seen
as being negligible.
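The classification of interarrival times can be sketched as follows. This is an illustrative Python fragment; the function name, the class bounds, and the sample carrier times are hypothetical.

```python
import bisect

def interarrival_histogram(carrier_times, class_bounds):
    """Count gaps between consecutive carrier-on times per time class.

    class_bounds are ascending upper limits; a final open-ended class
    collects every gap larger than the last bound.
    """
    counts = [0] * (len(class_bounds) + 1)
    for earlier, later in zip(carrier_times, carrier_times[1:]):
        counts[bisect.bisect_left(class_bounds, later - earlier)] += 1
    return counts

times_ms = [0, 1, 3, 13, 14, 64]       # carrier-on instants, milliseconds
counts = interarrival_histogram(times_ms, class_bounds=[2, 10])
share_long_gaps = counts[-1] / sum(counts)
```

A large value of `share_long_gaps` corresponds to the lightly utilized case described above.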

7.   Channel Acquisition Delay Histogram

The Channel Acquisition Delay Histogram depicts the
time spent by adaptors contending for and acquiring the
channel. The channel acquisition delay begins when an
adaptor becomes ready to transmit a packet and ends when the
first bit is transmitted into the channel. Included is all
of the time spent deferring due to a busy channel and the
time recovering and backing off from one or more collisions.
From these measurements, we can identify, for each interface,
the number of packets whose deferral times fell into various
time classes. When using a CSMA/CD access protocol, it
would be appropriate to assume that, under similar
conditions, the distributions of channel acquisition delay
times for all adaptors should appear very much the same. If
the distribution for one adaptor differs noticeably from the
others, this is a good indication that some type of problem
exists within that adaptor. Additionally, the mean channel
acquisition delay time and its associated standard deviation
can be determined from data contained in the histogram. The
collection of this data must be accomplished by each
individual adaptor. Results of the measurements taken at
each interface must then be forwarded to the central
monitoring site on demand, or at some prearranged time.
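
The mean and standard deviation mentioned above can be
estimated from the histogram counts alone. The Python sketch
below is one way to do so, assuming every packet in a time
class sits at the midpoint of that class. The function name,
class boundaries, and counts are illustrative assumptions and
are not taken from the NBSNET.

```python
import math

def histogram_mean_std(classes):
    """Estimate mean and standard deviation from a histogram.

    `classes` is a list of (lower, upper, count) tuples. Each packet is
    assumed to sit at the midpoint of its time class, so the result is
    an approximation whose accuracy depends on the class width.
    """
    total = sum(count for _, _, count in classes)
    if total == 0:
        raise ValueError("histogram contains no packets")
    mean = sum((lo + hi) / 2 * count for lo, hi, count in classes) / total
    variance = sum(count * ((lo + hi) / 2 - mean) ** 2
                   for lo, hi, count in classes) / total
    return mean, math.sqrt(variance)

# Hypothetical acquisition delays (microseconds) reported by one adaptor.
delay_classes = [(0, 100, 60), (100, 200, 30), (200, 300, 10)]
mean_delay, std_delay = histogram_mean_std(delay_classes)
```

Running the same function on the histogram from each adaptor
gives the central monitoring site comparable figures, so an
adaptor whose mean or spread stands apart can be singled out.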

8.   Communication Delay Histogram

The Communication Delay Histogram indicates the
delays that adaptors incur in communicating packets to their
destination. Theoretically, a communication delay begins
when an original packet becomes ready for transmission and
ends when that packet is received by the destination. By
definition, a communication delay excludes the time to
generate and communicate an acknowledgment packet back to
the original sender. As implemented by the NBSNET,
communication delay is measured from the time at which a
packet is ready for transmission until the last bit of the
packet is transmitted onto the channel. This value is saved
until the transmission is acknowledged, at which time a local
histogram is updated. From this it can be seen that
measurements must be taken by the adaptor and sent to the
central monitoring site upon demand or at a predetermined
time. With this approach, the communication delay time
recorded will not include the time to propagate the signal to
the destination. This is taken into consideration when
measuring 'one hop' delay. Although similar to communication
delay, 'one hop' delay includes propagation delay time, and
the time for the destination to communicate its
acknowledgment back to the source. Which delay is measured,
communication or 'one hop', depends upon the goals and
objectives of the measurement operation.
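
The three delay definitions above differ only in the event
that ends the interval. The sketch below expresses that in
Python, assuming an adaptor could record four timestamps for
each packet. The field names and sample times are
hypothetical, and starting the 'one hop' interval at the
ready time is this sketch's reading of the text.

```python
from dataclasses import dataclass

@dataclass
class PacketTimeline:
    """Hypothetical event times (seconds) for one packet."""
    ready: float          # packet becomes ready for transmission
    last_bit_sent: float  # last bit placed on the channel
    received: float       # packet received by the destination
    ack_received: float   # acknowledgment arrives back at the source

def nbsnet_communication_delay(t):
    # As measured on the NBSNET: ends when the last bit is transmitted,
    # so propagation to the destination is excluded.
    return t.last_bit_sent - t.ready

def theoretical_communication_delay(t):
    # Ends at reception by the destination; the acknowledgment is excluded.
    return t.received - t.ready

def one_hop_delay(t):
    # Includes propagation and the returning acknowledgment.
    return t.ack_received - t.ready

timeline = PacketTimeline(ready=0.0, last_bit_sent=0.0012,
                          received=0.00121, ack_received=0.0015)
```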

9.   Collision  Count  Histogram 

This histogram tabulates the number of collisions a
packet of any type encounters before being transmitted.
Interpretation of these statistics provides an indication of
the efficiency of a CSMA/CD protocol in allowing interfaces
to acquire the channel. Recording of collision information
for each specific packet must be accomplished at the local
level. Every time a packet is involved in a collision, a
counter in the packet header is incremented by one. Upon
successful transmission, the number of collisions incurred
by the packet prior to transmission is read directly from
the packet header by the central monitoring site.
Transferring information in this manner to the central
monitoring site would require a modification to the packet
format proposed in [Ref. 2]. This modification would
require the inclusion of a field for the number of
collisions experienced by the packet prior to successful
transmission. By combining collision count information from
throughout the network, the central monitoring site is able
to determine the mean number of collisions per packet
transmission and its associated standard deviation for the
entire network.

10.  Transmission Count Histogram

The Transmission Count Histogram indicates the
number of times a packet is transmitted before it is
communicated to its destination. A packet is communicated
when it is successfully received by the intended
destination. A packet may be transmitted but not
communicated due to a collision, line noise, or erroneous
transmission. The number of times a packet is transmitted
before it is communicated can be detected by the central
monitoring site. It does this by observing packet sequence
numbers and is thus able to recognize the first through the
last times a particular packet is transmitted and which
transmission is the communication. Through the use of this
histogram, we are able to determine the total number of
packets transmitted, the total number of packets
successfully communicated, the mean number of transmissions
prior to successful communication and the associated
standard deviation. Under ideal conditions, the number of
transmissions per communication is one. In a fully
operational network this will probably not be the case, the
actual value being dependent upon the load on the system and
the current network configuration.
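
The bookkeeping described above can be sketched in Python as
follows. The sketch assumes the monitor sees each
transmission as a (source, sequence number, communicated)
record, where the `communicated` flag is a hypothetical
stand-in for however the monitor learns that a transmission
reached its destination. All names and the sample trace are
illustrative.

```python
from collections import Counter

def transmission_count_histogram(observed):
    """Build a transmission count histogram from monitored packets.

    `observed` is an iterable of (source, sequence_number, communicated)
    tuples in the order seen on the channel. `communicated` is True for
    the transmission that reached the destination.
    """
    attempts = Counter()   # transmissions seen so far per packet
    histogram = Counter()  # transmissions needed -> number of packets
    for source, seq, communicated in observed:
        key = (source, seq)
        attempts[key] += 1
        if communicated:
            histogram[attempts.pop(key)] += 1
    return histogram

def summarize(histogram):
    """Return (transmitted, communicated, mean, standard deviation).

    Only packets that were eventually communicated are counted.
    """
    communicated = sum(histogram.values())
    transmitted = sum(n * count for n, count in histogram.items())
    mean = transmitted / communicated
    variance = sum(count * (n - mean) ** 2
                   for n, count in histogram.items()) / communicated
    return transmitted, communicated, mean, variance ** 0.5

# Packet 1 from adaptor A needs three transmissions; the others need one.
trace = [("A", 1, False), ("B", 7, True), ("A", 1, False),
         ("A", 1, True), ("B", 8, True)]
hist = transmission_count_histogram(trace)
```

Under the ideal conditions described in the text, every
packet would land in the class for one transmission and the
mean would be exactly one.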

C.   CHAPTER SUMMARY

We began this Chapter with an overview of the various
network measurement tools. The format of our overview
called for defining a specific measurement tool, followed by
a discussion of that tool's prominent measurement
characteristics. The tools discussed were: cumulative
statistics, snapshot statistics, trace statistics, emulation,
artificial traffic generators, and a measurement/control
center. The topic of measurement tool selection and
implementation was then presented. An approach was offered
as a means through which measurement tool selection could
take place. This approach requires that the object of the
measurement operation be defined, followed by a statement of
the goals of the measurement operation. Next, the
performance measures that best characterize the area of the
network under investigation are selected. From this, the
measurement tools most appropriate for use in obtaining the
required information from the network are identified and
implemented.

Having concluded that it would be infeasible to identify
and provide rationale for every measurement that could be
taken from a local area network, we discussed the
measurements currently being taken on an operational local
area computer network. Ten performance reports implemented
on the NBSNET were explained and their relevance to the
SPLICE LAN discussed. These reports were: Host Communication
Matrix, Group Communication Matrix, Packet Type Histogram,
Data Packet Size Histogram, Throughput-Utilization
Distribution, Packet Interarrival Time Histogram, Channel
Acquisition Delay Histogram, Communication Delay Histogram,
Collision Count Histogram, and Transmission Count Histogram.

The question, 'How much of the network traffic should
be measured?', was implicitly addressed in our discussion of
artificial traffic generators. Basically, two approaches
exist. By measuring everything on the network it would be
possible to totally reconstruct the original traffic. Some
problems exist with this approach. First of all, there would
be a prohibitive amount of storage required for the data
collected from the network. Secondly, the review and
analysis of this information would take an excessive amount
of time. Finally, it may be found that adaptors are spending
an inappropriate amount of time collecting and processing
measurement data.

The second approach to network measurement employs a
sampling technique. Here, performance measurements are
constructed only from a subset of the total packets
transiting the network. Measurements can be randomly taken
of the normal packets flowing on the network, or from those
packets created explicitly for measurement purposes by an
artificial traffic generator. In the first case, no control
is exercised over the packets being transmitted through the
network. In the second case, control of the packets is
possible. The characteristics and benefits of artificial
traffic generators have been previously discussed in this
thesis and in [Ref. 12]. Additional justification for the
implementation of artificial traffic generators is provided
by Tobagi in the statement, "Generally, internal subnet
performance is better studied in a controlled traffic
environment rather than in a real traffic environment"
[Ref. 6: p. 1442].
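
The first sampling case, random selection from normal
traffic, can be illustrated with a short Python sketch. It
assumes simple independent (Bernoulli) sampling, which is
only one of several possible schemes and is not prescribed
by the sources cited above. The function name and the sample
traffic are hypothetical.

```python
import random

def sample_packets(packets, fraction, seed=None):
    """Return a random subset of `packets` for measurement.

    Each packet is kept independently with probability `fraction`
    (Bernoulli sampling). A fixed `seed` makes the selection
    repeatable, which helps when comparing monitoring periods.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    return [p for p in packets if rng.random() < fraction]

# Hypothetical packet sizes (bytes) seen during one monitoring period.
traffic = [64, 512, 128, 1024, 64, 256, 512, 64, 128, 1024]
sampled = sample_packets(traffic, fraction=0.3, seed=1)
```

The sampling fraction trades storage and adaptor processing
time against the accuracy of the resulting reports.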

To obtain a thorough performance analysis of the SPLICE
LAN, this researcher feels that network operations personnel
must be able to generate known artificial traffic loads on
the system. To implement this capability, it is proposed
that each adaptor be able to function as an artificial
traffic generator. Process activation, deactivation and
parameter establishment would be controlled by the central
monitoring site. Additionally, it is recommended that, for
specified monitoring periods, the network possess the
capability to measure every occurrence of certain types of
events. This capability is required in order to create
various matrices and histograms (e.g., Host Communication
Matrix, and Packet Type Histogram).

The researcher does not feel there exists an urgent
requirement for an emulation capability. The composition of
the SPLICE configuration has been established and is
reflected in [Ref. 13]. Possibilities for expansion would
seem to be in the area of additional host processing
capability. It is the opinion of this researcher that the
controlled addition of processing capability will not tax the
network's ability to satisfactorily deliver packets. This
conclusion is based on review of a report dealing with the
performance evaluation of the Ethernet local computer
network [Ref. 14], and on the assumption that there exists
enough similarity between the SPLICE LAN and the Ethernet to
justify a conclusion of similar performance under increased
loading conditions.

It is recommended that the ten measurement reports
discussed in this Chapter be adopted as the basis upon which
the measurement capability for the SPLICE LAN be
established. It is the opinion of this researcher that these
reports provide an accurate and fairly comprehensive picture
of network performance which can be utilized by operations
personnel in managing the network. Additional measurement
reports that could augment those already presented would
possess the ability to measure response time, processor and
line utilization, characters and messages received in error
per unit time, average delay, and software queue lengths and
buffer counts such as in adaptors and shared resources.

Uses for measurements taken from a computer network
include performance analysis, and component failure
identification, isolation, and testing. The degree of
success achieved in the accomplishment of these tasks is
highly dependent upon a comprehensive and accurate
measurement facility. To ensure this capability continues to
be provided throughout the life of the network, it is
imperative that the measurement software incorporate a
flexible design in order to accommodate expansion of, and
modification to, the measurement tools.

IV.  NETWORK PERFORMANCE ANALYSIS AND COMPONENT FAILURE

In Chapter 2 we presented and discussed various
monitoring methodologies. The concept of network measurement
was then undertaken in Chapter 3. Using the knowledge
imparted by these Chapters, we can now discuss the topics of
network performance analysis and component failure handling.
Basically, network performance analysis is concerned with
evaluating data and statistical reports obtained by the
network's measurement function. During this evaluation
process, measurements are scrutinized for signs of component
failure and inefficient network functioning. Additionally,
performance analysis of the network allows us to: adjust
network performance parameters in order to 'tune' the
network, plan for network growth, and identify bottlenecks
at various components throughout the system. For our
discussion, the concept of failure has been more broadly
defined to include the network's inability to provide timely
service to its users. What this means is that degradation
of selected performance measures, such as network
throughput, will be classified as a failure within the
network.

Initially, we discuss the question, 'At what time should
the performance analysis take place?'. We then look at the
function of performance analysis as it pertains to a local
area computer network. Finally, a presentation and
evaluation of various techniques used in the detection and
diagnosis of network component failure is undertaken.

A.   PERFORMANCE ANALYSIS TIMING

There are three time frames in which performance
analysis can take place. These are on-line, off-line, and
instantaneously. Off-line analysis requires the evaluation
of performance measurements to take place upon completion of
the monitoring period. On-line analysis enables the
evaluation to take place during the monitoring period.
Evaluating data at this time implies a delay between the
generation of the measurements, their analysis, and
subsequent actions taken as a result of this analysis.
Instantaneous analysis is accomplished through the use of
dynamic control programs. These programs provide for the
immediate analysis of data and statistical reports, followed
by any corrective action that may be required.

1.   Off-Line Analysis

Off-line analysis implies that the records generated
by the monitoring system are placed in mass storage for
future analysis. Performance analysis is accomplished in
this way by the NBSNET [Ref. 8]. Delay in corrective action
initiation due to off-line analysis experienced by the
NBSNET was 5-10 minutes [Ref. 3: p. 726]. Implementing this
'method' of performance analysis provides the analyst with
the ability to obtain an overall picture of network
performance before making any otherwise rash parameter
adjustments. This method also allows a more in-depth
analysis of the performance measurements through the use of
off-line testing and evaluation programs. An additional
reason for the use of off-line analysis is based upon the
speed of the LAN. The high rate at which packets are
transmitted means that there is only a small amount of time
to simultaneously assimilate the data and create statistical
reports upon which to act. The major problem associated with
off-line analysis is its lack of responsiveness. As a result
of the speed of a LAN, the environment which was recorded
during the monitoring period may not exist upon completion
of the analysis. Therefore, any adjustments to the
parameters based upon the analysis may no longer be
applicable to the current environment.

2.   On-Line Analysis

On-line performance analysis enables network operators
to capitalize on the benefits offered by a real time
computational environment. Although the degree varies,
on-line performance analysis is currently practiced on the
Los Alamos Integrated Communications Network [Ref. 21], and
the Lawrence Livermore National Laboratory Octopus Network
[Ref. 24]. Additionally, the Codex Distributed Network
Control Systems 200 and 330 utilize an on-line approach to
performance analysis [Ref. 25]. This 'method' of analysis
provides for: a more immediate detection, diagnosis and
correction of network failure, a greater utilization of
advanced graphic capabilities for monitoring the network,
and an increased use of decision support capabilities which
can provide the operator with suggested courses of action
and adjustments to network performance parameters. Two main
problems exist with this approach. First, human
intervention is still required for the adjustment of
parameters in order to modify specific network performance
measures. Second, there remains considerable delay, with
respect to the speed of a local area computer network,
between the capture of network performance measurements and
subsequent action to affect their values.

3.   Instantaneous Analysis

It is the researcher's opinion that the implementation
of an instantaneous performance analysis capability could
theoretically optimize the efficiency and effectiveness with
which network evaluation is conducted. Networks that have
implemented or plan to implement an instantaneous
performance analysis capability are the Ethernet [Ref. 14:
p. 717], and the Defense Data Network [Ref. 19: p. 4]. This
technique allows network operations personnel to establish
ranges within which performance measures may vary. If
measures for which ranges have been established breach these
predefined limits during normal network operations, an
interrupt can be generated which initiates a program
designed to bring the value of the performance measurement
back within the prescribed range. In this way we are taking
maximum advantage of the computer's ability to process
information almost instantaneously and thereby providing an
immediate response to current network conditions.
Instantaneous analysis and dynamic control of a network is
no longer just a theoretical concept. Advanced
installations can now offer significantly simplified or even
automatic intervention such as automatic restarts, automatic
remote-site monitoring, and electronic reconfiguration
[Ref. 20: p. 10]. The major problem with this technique is
that there exists a loss of explicit control of the network
by operations personnel as the monitoring, performance
analysis, and parameter adjustment become more automated.
Additionally, unless steps are taken to ensure otherwise,
the automation of these procedures may well deprive network
operators of information concerning just what is happening
inside the network.

B.   LAN PERFORMANCE ANALYSIS

Performance is the property of a system that: it works,
it is responsive, and that it is available [Ref. 15: p. 1].
Given this, our performance analysis technique must enable
us to ascertain that these characteristics are accomplished
in the most efficient manner possible. By implementing a
performance analysis capability, we hope to obtain
information that: can be utilized to increase system
responsiveness and reliability, will assist in capacity
planning, and will reduce network operating costs.
Additionally, tracking network performance will assist
operators in pinpointing more precisely the nature of a
failure, thereby helping to correct it more quickly and
reduce component downtime. This section includes the
identification of those performance metrics that have been
selected by the researcher as those which can be most
effectively utilized in the analysis of SPLICE LAN
performance. Additionally, a discussion of performance
parameter identification and selection is undertaken.

1.   Performance Measure Utilization

Utilizing  the  information  provided  by  the  ten 
reports  explained  in  Chapter  3,  we  are  able  to  effectively 
analyze  the  performance  of  the  LAN.  The  question  that  must 
be  answered  now  is,  'What  measures  do  we  look  at,  and  how  do 
we combine them to ensure a complete and accurate
representation of the network's performance is obtained?'.

The selection and combination of measurements for
the purpose of network performance analysis is based upon
the goals and objectives of the pending evaluation. Our
emphasis will be on using performance analysis to assist in
component failure detection and in improving the operational
functioning of the network. For this to occur, acceptable
ranges for critical performance measures under specific
network loading conditions and configurations must be
established. These ranges may be determined dynamically at
defined intervals by analytical models and kept in tables
for comparison against actual measured performance. Values
of critical performance measurements which do not fall
within established limits should cause an interrupt to be
generated which in turn initiates some form of remedial
action on the part of the system. Finally, in order to
maintain explicit control of the system, network operations
personnel must be given the ability to establish and set the
ranges for these criteria, and predefine certain values
taken from the network as critical when they occur, such
that the occurrence will be brought immediately to their
attention.
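
The range table, the interrupt, and the operator notification
described above might be organized as in the following
Python sketch. It is only one possible arrangement: the class
name, the two callback hooks, and the example measures and
limits are all hypothetical, and the remedial action itself
is left to the caller.

```python
class RangeMonitor:
    """Compare measured values against operator-defined ranges."""

    def __init__(self, on_violation, notify_operator):
        self.ranges = {}        # measure name -> (low, high)
        self.critical = set()   # measures flagged as critical
        self.on_violation = on_violation        # remedial action hook
        self.notify_operator = notify_operator  # operator alert hook

    def set_range(self, measure, low, high, critical=False):
        # Operators keep explicit control by setting the limits themselves.
        self.ranges[measure] = (low, high)
        if critical:
            self.critical.add(measure)
        else:
            self.critical.discard(measure)

    def check(self, measure, value):
        """Return True if `value` is within range; otherwise react."""
        low, high = self.ranges[measure]
        if low <= value <= high:
            return True
        self.on_violation(measure, value)
        if measure in self.critical:
            self.notify_operator(measure, value)
        return False

actions, alerts = [], []
monitor = RangeMonitor(lambda m, v: actions.append((m, v)),
                       lambda m, v: alerts.append((m, v)))
monitor.set_range("throughput_kbps", 200, 1000, critical=True)
monitor.set_range("mean_collisions", 0, 2)
```

Keeping a record of every violation, as the two lists do
here, is one way to keep operators informed of what the
automated procedures are doing.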

Many possible combinations of performance measurements
exist. Metcalfe and Boggs [Ref. 18: p. 431] utilized
the  criteria  of:  acquisition  probability  (the  probability 
that  exactly  one  station  attempts  a  transmission  and 
acquires  the  channel),  wait  time  (the  mean  time  a  packet 
must  wait  before  successfully  acquiring  the  channel)  and 
channel  efficiency  (that  fraction  of  time  the  channel  is 
carrying  good  packets)  to  evaluate  the  performance  of  the 
Ethernet.  This  approach  is  more  of  an  experimental  nature 
and,  in  the  opinion  of  the  researcher,  seems  to  be  limited 
in  its  usefulness  in  an  operational  environment. 
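
For illustration, the three criteria can be computed from a
simple contention model. The Python sketch below uses the
formulas commonly associated with the Metcalfe and Boggs
analysis, in which each of Q contending stations transmits in
a slot with probability 1/Q. The formulas as written here and
the example values (4096-bit packets, a 3 Mb/s channel, and a
16 microsecond slot) are assumptions of this sketch and
should be checked against [Ref. 18] before being relied upon.

```python
def acquisition_probability(q):
    """Probability that exactly one of q stations acquires a slot."""
    return (1 - 1 / q) ** (q - 1)

def mean_wait_slots(q):
    """Mean number of slots spent contending before acquisition."""
    a = acquisition_probability(q)
    return (1 - a) / a

def channel_efficiency(q, packet_bits, capacity_bps, slot_seconds):
    """Fraction of time the channel is carrying good packets."""
    packet_time = packet_bits / capacity_bps
    return packet_time / (packet_time + mean_wait_slots(q) * slot_seconds)

# Illustrative values only: two contending stations.
efficiency = channel_efficiency(q=2, packet_bits=4096,
                                capacity_bps=3_000_000,
                                slot_seconds=16e-6)
```

In this model efficiency falls as packets get shorter or as
more stations contend, which is the kind of relationship an
analyst would want to confirm experimentally.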

Another possible combination was suggested by Tobagi
in a presentation at the Naval Postgraduate School in
Monterey, California on the 21st of October 1982. One of the
topics addressed in that presentation dealt with
identification and utilization of performance measures for a
local area computer network. These measures were: bandwidth
utilization, system capacity utilization, and message delay.
By breaking these down into more specific monitoring areas,
we are provided with the ability to obtain a comprehensive
picture of network performance. These disaggregated
performance measures fall into two categories. The first
category provides for the evaluation of the network's
communication capability and includes as criteria:
throughput, response time, and file transfer rate. The
second category provides for the evaluation of resource
utilization throughout the network and includes: processor
utilization, buffer utilization, and line utilization
[Ref. 16: p. 48]. The combination of these measurements,
together with control of the parameters which affect their
values, enables network operations personnel or dynamic
control programs to detect degrading network performance and
take appropriate corrective action.

2.   Performance Parameter Selection

Following the selection of appropriate network
performance measures, parameters must be identified which
can be adjusted in order to affect the values taken on by
these measurements. These parameters and their associated
values should be chosen on the basis of existing analytical
and simulation results as well as previous experiments
carried out in varied traffic conditions [Ref. 22]. The
researcher feels that once the parameters that affect the
value of a specific performance measure have been
identified, they should be prioritized. This prioritization
should be based upon the parameter's effect on the
performance measurement for a given system configuration.
This suggests that there may exist different prioritizations
of the parameters for different configurations of the
network. One possible prioritization scheme might call for
the adjustment of those parameters first which have the
greatest effect on the value of the performance measurement.
An important fact that should be considered when selecting,
prioritizing, and adjusting parameters is that the majority
of performance measurements and parameters are interrelated.
For example, an adjustment made to increase throughput, such
as increasing the size of the packet data field, will also
affect the delay experienced by the network user.
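
The prioritization scheme suggested above, greatest effect
first and dependent on configuration, could be sketched in
Python as shown below. The parameter names, configuration
names, and effect estimates are entirely hypothetical. In
practice the estimates would have to come from simulation or
experiment.

```python
def prioritize_parameters(effects, configuration):
    """Order parameters by the magnitude of their effect on one measure.

    `effects` maps configuration name -> {parameter: estimated change in
    the performance measure per unit adjustment}. Sign is ignored, since
    a large negative effect is as useful for tuning as a positive one.
    """
    table = effects[configuration]
    return sorted(table, key=lambda name: abs(table[name]), reverse=True)

# Hypothetical effects on throughput for two network configurations.
throughput_effects = {
    "light_load": {"packet_data_size": 0.8, "retry_limit": 0.1,
                   "backoff_slot": -0.3},
    "heavy_load": {"packet_data_size": 0.4, "retry_limit": 0.2,
                   "backoff_slot": -0.9},
}
light_order = prioritize_parameters(throughput_effects, "light_load")
heavy_order = prioritize_parameters(throughput_effects, "heavy_load")
```

The two orderings differ, which reflects the point made in
the text that a prioritization holds only for a given
configuration.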

Finally, there are many parameters which can affect
one performance measurement. Likewise, the adjustment of
one parameter is capable of affecting many performance
measures. This being the case, it would be extremely
difficult at this time to attempt a listing of all those
parameters which affect the values of the performance
measures we presented above. Rather, these would be more
accurately identified through the use of simulation,
modeling, and experimentation. In general, one cannot
identify a single tunable parameter which directly affects
one specific performance characterizing measure. Instead,
one can identify the two sets (parameters and measures), and
through experimentation, define their intersections
[Ref. 26: p. 1].

C.   COMPONENT FAILURE

Along with managing the local area network-long haul
network interface, component failure detection and diagnosis
is seen as the most important function of network
management. The ability to provide users with a responsive,
available network is of primary importance. To do this, we
must be able to quickly detect and diagnose network
failures. Having this capability will allow the system to
immediately initiate appropriate recovery procedures and
restore full service to its users. This section will address
the topics of component failure detection, failure
diagnosis, and reporting of the failure throughout the
network.

1.   Failure Detection

A  failure  detection  function  should  enable  network 
operators  to  recognize  operating  and  configuration  problems 
immediately  so  they  can  intervene  in  a  timely  fashion  to 
correct them [Ref. 20: p. 10]. Not only do we want to be
made  aware  of  catastrophic  failures,  but  also  of  gradually 
failing  conditions.  It  is  a  well  designed  performance  anal- 
ysis capability  that  enables  us  to  be  aware  of  the  latter. 

Component failure detection within a local computer
network can occur in many ways. Probably the simplest is a
face to face encounter between a user and network operator,
the subject of discussion being either an inoperable
component or unsatisfactory network service. A phone call
from a remote user is another method of detection. We can
also see a network operator laboriously reviewing system
statistic reports for signs of degrading performance. From
these passive monitoring techniques which required extensive
operator intervention, the emphasis has shifted and is now
on automatic alerts based on equipment failures, and in more
sophisticated applications, also on user-defined limits on
such items as transmission volumes and response time
[Ref. 20: p. 10]. Implementation of an automatic failure
detection capability is currently being planned for the
Defense Data Network [Ref. 19: p. 7].

It is not the researcher's intent to suggest that
all, or any subset of the detection methods to be presented
below should be automated. Rather, the author's approach
will be to identify and discuss possible techniques of
failure detection, understanding that their implementation
could take a variety of forms.

a.  Maintenance Detection

Failures can be identified through normal
network maintenance activities. These activities may be
under operator or program control and may occur at
predefined intervals or on an as needed basis. For example,
in the process of updating the configuration data base, the
need may arise to poll all components within the network.
No response from a particular element may indicate the
existence of a failure. Additionally, the failure to receive
a required maintenance or status report from a component is
another indication that a problem may exist. Testing the
operation of the network utilizing artificial traffic
generation may also lead to the discovery of network
inefficiencies. Finally, the use of watchdog packets
[Ref. 8: p. 727] to verify active and inactive components is
also a viable tool that can be used in identifying failing
elements.
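
The polling example above can be sketched in Python as
follows. The `poll` argument is a hypothetical stand-in for
whatever polling primitive the network provides, and
retrying a few times before declaring a component suspect is
an assumption of this sketch rather than a requirement stated
in the text.

```python
def poll_components(components, poll, max_attempts=3):
    """Return the components that never answered a poll.

    `poll` is a caller-supplied function that returns True when the
    named component responds. A component is reported only after
    `max_attempts` consecutive polls go unanswered.
    """
    suspects = []
    for name in components:
        if not any(poll(name) for _ in range(max_attempts)):
            suspects.append(name)
    return suspects

# Simulated network in which adaptor "A3" does not respond.
responding = {"A1", "A2", "A4"}
silent = poll_components(["A1", "A2", "A3", "A4"],
                         poll=lambda name: name in responding)
```

A silent component is only a candidate for failure, so the
result is best treated as input to diagnosis rather than as a
confirmed fault.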

b.  Performance  Analysis  Detection 

It is the researcher's opinion that the major benefit to be
gained from the performance analysis of a LAN is the added
capability it gives network operations personnel in
detecting failed components. Status reports generated by
individual components, and those created by the central
monitoring site, can be reviewed for changes of state,
obvious trends, and erratic component performance.
Additionally, component error counts can be reviewed for
degrading conditions. Two systems which use approaches
similar to these are SNA [Ref. 27: p. 12] and the Arpanet
NCC [Ref. 23: p. 6-6]. In SNA, Record Maintenance
Statistics are generated periodically and sent to the
control point, where they are logged and scanned to detect
degrading component performance. In the Arpanet, IMPs
examine their own status and send reports to the NCC every
minute. Finally, by monitoring the availability of a
component, which is defined as the mean time between
failures (MTBF) divided by the sum of the MTBF and the mean
time to repair (MTTR), we are able to detect a very gradual
degradation of that component's ability to perform its
function over an extended period of time.
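
As a hedged illustration of the availability measure just defined, the sketch below computes MTBF / (MTBF + MTTR) and flags a component whose availability falls in every successive reporting period. The function names, the sample figures, and the "strictly decreasing" test for degradation are assumptions made for the example, not part of the original procedure.

```python
from typing import List, Tuple


def availability(mtbf: float, mttr: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), both in the same time unit."""
    return mtbf / (mtbf + mttr)


def is_degrading(history: List[Tuple[float, float]]) -> bool:
    """True if availability drops in every successive (MTBF, MTTR) sample."""
    values = [availability(mtbf, mttr) for mtbf, mttr in history]
    return len(values) >= 2 and all(b < a for a, b in zip(values, values[1:]))


# Hypothetical (MTBF, MTTR) figures in hours for three reporting periods.
samples = [(500.0, 2.0), (450.0, 3.0), (400.0, 5.0)]
print([round(availability(m, r), 4) for m, r in samples], is_degrading(samples))
```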

c.   Localized Detection

Detection of a failure within a component can be
accomplished by the component itself, assuming the failure
is not a catastrophic one. A trap mechanism within an
adaptor or component interface is a 'device' which is
activated whenever a certain hardware failure occurs or a
block of code is executed. This mechanism not only detects
the problem, but can be used to initiate some type of
diagnostic or corrective action. Hardware devices are also
used for problem detection at local levels. The Arpanet IMP
hardware is capable of automatically detecting power
failures [Ref. 23: p. 6-5], while the Ethernet employs a
watchdog timer which disconnects the transceiver from the
channel if it starts acting suspiciously [Ref. 13: p. 20].
One final method of local failure detection is accomplished
by establishing a maximum number of retries for a packet
transmission. After a maximum of 15 retries to transmit a
packet, a transmitter on the Ethernet gives up and reports
the failure condition [Ref. 17: p. 20].
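
The retry limit can be sketched as below. The limit of 15 is the figure quoted in the text; treating it as 15 retries after one initial attempt, and the shape of the send function, are assumptions made for illustration.

```python
from typing import Callable

MAX_RETRIES = 15  # retry limit quoted in the text for the Ethernet


def transmit_with_retry_limit(
    send: Callable[[], bool],
    max_retries: int = MAX_RETRIES,
) -> bool:
    """Attempt a transmission once, then retry up to max_retries times.

    Returns True as soon as send() succeeds, or False once the limit is
    exhausted, at which point the caller should report the failure.
    """
    for _attempt in range(1 + max_retries):
        if send():
            return True
    return False


# A channel that never succeeds (hypothetical) exhausts the limit.
if not transmit_with_retry_limit(lambda: False):
    print("retry limit reached: report failure condition")
```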

Detecting the failure of an adaptor's attached component
and any peripherals associated with it is also a
requirement. Failure of the attached component can be
detected through the use of an inactivity timer. The
purpose of this timer is to signal the possibility that the
attached component may have failed. Once the timer runs
down, action is initiated to verify the status of the
component. If test results indicate that the component is
down, the central monitoring site is notified. Finally, it
is assumed that failure detection of peripherals attached
to the network component will be accomplished by that
component. However, the requirement exists that the status
of these peripherals be accessible to local failure
detection routines in order that the central monitoring
site may be kept aware of their condition.
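
A minimal sketch of the inactivity timer logic follows, assuming time is supplied by the caller as a plain number so the behavior is easy to trace. The class name, the return labels, and the test and notify callbacks are hypothetical.

```python
from typing import Callable


class InactivityTimer:
    """Tracks the time of the last activity seen from an attached component."""

    def __init__(self, timeout: float, now: float = 0.0) -> None:
        self.timeout = timeout
        self.last_activity = now

    def note_activity(self, now: float) -> None:
        self.last_activity = now

    def expired(self, now: float) -> bool:
        return now - self.last_activity >= self.timeout


def check_component(
    timer: InactivityTimer,
    now: float,
    run_test: Callable[[], bool],
    notify: Callable[[str], None],
) -> str:
    """When the timer has run down, test the component; notify on failure."""
    if not timer.expired(now):
        return "active"
    if run_test():
        timer.note_activity(now)  # component is merely idle; restart timer
        return "idle"
    notify("component down")      # alert the central monitoring site
    return "down"


alerts = []
timer = InactivityTimer(timeout=30.0)
print(check_component(timer, 80.0, lambda: False, alerts.append), alerts)
```

Expiry of the timer alone does not declare a failure; it only triggers the verification step, which matches the two-stage procedure in the text.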

d.   Neighbor  Detection 

If a node experiences a catastrophic failure without being
able to notify the central monitoring site of the impending
doom, then we must have a method by which this failure can
be detected and the central monitoring site notified. At
the local level (i.e., without assistance from the central
monitor) there exist two possibilities, both of which are
based on the assumption that the failed node was involved in
a session when the failure occurred, or that some other node
will attempt to initiate a session with the failed node
within a reasonable amount of time after the failure.
Assuming the node in question is involved in a session,
there exist two methods of detection. The first method
involves the maximum number of times a packet will be
transmitted without receiving an acknowledgement. If,
during a session, a packet is successfully transmitted the
maximum number of times without receiving an
acknowledgement, then the transmitting station can assume
the destination is down and notify the central monitoring
site. Similarly, if the destination node stops receiving
packets from the source node without getting an end of
message indication, it can assume the source has failed and
notify the monitoring site. Finally, the technique based on
nonreceipt of acknowledgements can also be used when one
station is attempting to establish a session with another
station.

2.   Failure Diagnosis

Once a failure has been detected, it must be located and
its cause determined. These are the primary objectives of a
failure diagnosis function. This function may be automated
such that the detection of a failure initiates a program
which performs various diagnostic routines in support of
these objectives. The Defense Data Network utilizes an
approach similar to this. As planned, the DDN Monitoring
Centers will be capable of automatically monitoring network
elements to identify, isolate, and sometimes correct
problems without the involvement of specialized maintenance
personnel.

When designing a set of diagnostic tools it should be noted
that, for some diagnostic tests and routines, monitoring
and normal data traffic flow may be suspended. Assuming
this to be the rule rather than the exception, diagnostic
tools should be developed accordingly. In the following
sections we will identify and discuss a number of these
tools.

a.   Tests  and  Traps 

Individual diagnostic programs can be utilized to initiate
specific tests in areas of the failed component.
Additionally, traps can be utilized to activate these
programs once a failure has been detected. Tests conducted
on the component might include checking all physical
connections the component has with other devices and
comparing a block of code in the component with an image of
what that code should be.

b.  Interface Looping

A good diagnostic tool for a network interface is the
ability of a node to send packets to itself. In giving a
node the ability to transmit and simultaneously receive the
transmitted packets, we are able to obtain complete
verification of the network interface. This is where our
artificial traffic generator comes into use. We can
generate a stream of packets with known content and known
size and arrival distributions. By checking the returning
traffic against what was just generated, we can identify
any problems which may exist in our network interface.
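
The loopback check can be illustrated with a short sketch. It assumes a seeded pseudo-random generator as the artificial traffic source and models the interface as a function that returns whatever it receives back; both are stand-ins, and arrival timing is not modeled here.

```python
import random
from typing import Callable, List


def make_test_packets(count: int, max_size: int, seed: int = 1) -> List[bytes]:
    """Generate reproducible packets of known content and varying size."""
    rng = random.Random(seed)
    return [
        bytes(rng.randrange(256) for _ in range(rng.randint(1, max_size)))
        for _ in range(count)
    ]


def loopback_test(packets: List[bytes], loop: Callable[[bytes], bytes]) -> List[int]:
    """Send each packet through the interface; return indices that differ."""
    return [i for i, sent in enumerate(packets) if loop(sent) != sent]


packets = make_test_packets(count=5, max_size=64)
healthy = lambda p: p                               # interface returns data intact
faulty = lambda p: bytes([p[0] ^ 0xFF]) + p[1:]     # corrupts the first byte
print(loopback_test(packets, healthy), loopback_test(packets, faulty))
```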

c.  Dynamic  Diagnostic  Tool  (DDT) 

The use of a Dynamic Diagnostic (or Debugging) Tool was
introduced by the Arpanet NCC [Ref. 23: p. 6-5]. The DDT is
a set of software programs which are utilized in an effort
to diagnose the cause of a component failure. The DDT may
be local to, or transmitted to, the machine associated with
the failed component. DDT can be used to perform a number
of tests and operations geared towards determining the
cause of a component failure. These include: the
examination and modification of a specific word in memory,
clearing an entire block of memory, searching memory for a
particular stored value, examining the contents of specific
buffers and modifying their contents, measurement of a
device's realtime clock, and implanting traps and interrupt
handlers in a device suspected of having software or
hardware problems.

d.  Dump and Load

If all other diagnostic methods fail to determine the cause
of a failure, one final course of action exists. The entire
contents of main memory existing within the component at
the time of failure are dumped to off-line storage where
additional diagnosis and analysis can be conducted.
Simultaneously, a new copy of the appropriate software is
loaded into the component. If this procedure still fails to
correct the problem or bring the device back on-line, it
can be assumed, with a high degree of certainty, that a
hardware problem exists and that the appropriate vendor
personnel should be contacted.

3.   Failure Notification

We now address the question, 'Who is notified and what data
bases are updated upon the detection of a failed
component?'. Assuming detection and diagnosis were
accomplished by a distributed component (relative to the
monitoring site) in the network, the central monitoring
site should be the first entity notified of the failure.
Realistically, notification of the various entities to be
identified below could happen simultaneously, or nearly so.
It would then be the responsibility of the central
monitoring site to notify additional entities and to update
the appropriate data bases. These data bases are updated by
the central monitoring site in basically two ways: either
by operations personnel or by a program which automatically
makes entries into the appropriate data bases upon receipt
of failure alert messages. The data bases that must be
updated include: the configuration data base, the problem
management data base, and a historical data base which is
utilized as a means through which the evolution of the
network can be tracked.
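
The automatic update path can be sketched as follows. This is only an illustration under stated assumptions: the three data bases are modeled as in-memory Python structures, and the class, field, and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MonitoringSite:
    """Toy model of the three data bases kept by the central monitoring site."""

    configuration: Dict[str, str] = field(default_factory=dict)
    problems: List[Dict[str, object]] = field(default_factory=list)
    history: List[Tuple[float, str, str]] = field(default_factory=list)

    def handle_failure_alert(self, component: str, cause: str, when: float) -> None:
        """Record one failure alert in all three data bases."""
        self.configuration[component] = "down"           # configuration data base
        self.problems.append(                            # problem management data base
            {"component": component, "cause": cause, "opened": when}
        )
        self.history.append((when, component, "failed"))  # historical data base


site = MonitoringSite()
site.handle_failure_alert("printer-2", "power failure", when=100.0)
print(site.configuration, len(site.problems), site.history)
```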

There are a number of additional entities which must also
be notified upon the detection of a failed component. To
begin with, if monitoring site personnel were unable to
restore the failed component, then the appropriate vendor
must be contacted. Users elsewhere on the LAN will be
notified of the failure by the problem management data base
or configuration management data base when they log onto
the network, or when they attempt to utilize the resources
normally provided by that component. Users attempting to
utilize the resources of the failed component from a
geographically dispersed site through the DDN will be
notified of the failure in a manner analogous to local
users once they have made contact with the LAN. Finally,
those members of the operations staff who may be in the
process of conducting any type of experiments or monitoring
activities which include the failed component must be
explicitly notified of the configuration change.

D.   CHAPTER  SUMMARY 

We began this Chapter with a discussion of the possible
time frames in which network performance analysis could
occur. Those discussed were: off-line, on-line, and
instantaneous analysis. The topic of local area network
performance analysis was then entered into. In this section
we discussed performance measure utilization and
performance parameter selection. A presentation of various
methods of component failure detection and diagnosis
concluded the body of the Chapter.

It is the researcher's opinion that a SPLICE LAN could
benefit from each type of analysis. Instantaneous analysis
could be utilized to evaluate and affect the performance of
the network layer protocol and below. This would reduce
management overhead in that personnel would not be needed
to constantly monitor network status via a CRT, or to
review and analyze printouts reflecting the network's
condition. For example, adjustments made to increase
throughput during times of network congestion, such as
modifying our backoff technique, would be accomplished by a
program rather than requiring human intervention. Overhead
costs associated with the running of these programs would
have to be compared against the costs incurred by
non-automated and semi-automatic procedures in order that
an efficient and cost effective functional implementation
is achieved. An on-line analysis capability would give the
operator a window through which the functioning of the
network could be observed. Through the use of some sort of
decision support system, the operator could obtain
assistance, possibly in the form of suggested action or in
adjusting parameters which affect global performance
measures. Off-line analysis would provide operators with
the ability to analyze the performance of the network in an
environment separate from the system. This method would
remove any pressure that might be experienced by the
operator when attempting to analyze performance while
on-line.

The performance measures suggested by the author for the
SPLICE LAN are separated into two categories. The first
category provides for the evaluation of the network's
communication capability. The second category includes
measures which can be used to evaluate resource utilization
throughout the network. Each of these was described in
detail earlier in the Chapter. Ranges for these measures
should be determined dynamically during network operation;
however, network operations personnel must be able to
override dynamic range establishment and set their own
range values as needed. Numerous parameters exist which can
be utilized to affect the values of these performance
measures. Rather than proposing a list of tunable
parameters that, by its very nature, would be incomplete,
the author offers three suggestions for their
identification and utilization. These parameters and their
associated values should be established on the basis of
existing analytical models and simulation results as well
as operational experimentation. Once identified, these
parameters should be prioritized in a manner which reflects
their effect on specific performance measurements. Finally,
the fact that adjustment of one parameter may affect the
value of more than one performance measure must be
considered in selection and implementation of the
parameter.

Our discussion of a failure detection and diagnosis
capability as part of the SPLICE LAN will emphasize the
limiting of these capabilities in components distributed
throughout the network. The failure detection capability of
a distributed component is limited to the identification of
those failures which cannot be detected by the central
monitoring site. The diagnosis capability is to be
similarly restricted. It is the researcher's opinion that
this approach will reduce diagnostic software duplication
throughout the network, eliminate maintenance on
distributed diagnostic tools, and provide for more central
control of failure analysis and problem management. Upon
detecting a failure, the component will send some form of
problem alert message to the central monitoring site. From
that point, the actions taken by the monitoring site are
identical to those that it would take if it had detected
the failure. Since we have limited the detection and
diagnosis capability of the distributed components, we must
conversely increase those of the central monitoring site.
It is felt that periodic status reports such as those
described in Chapter 3 should be sent to the central
monitoring site on a regular basis. There they can be
analyzed for possible signs of component failure and system
degradation. In addition to those capabilities alluded to
above, the monitoring site must be able to direct the
transmission of status reports from distributed network
components to itself. It must possess a diagnostic tool
that can be utilized throughout the network to isolate and
identify failures in a manner similar to that of the DDT
employed by the Arpanet Network Control Center. More
details concerning the functioning of a central monitoring
site will be discussed in Chapter 6. It is sufficient to
conclude at this time that, by centralizing the majority of
the network's failure detection and diagnostic
capabilities, we are increasing control of the failure
handling procedures of the network, reducing software
duplication and maintenance, and minimizing costs
associated with the implementation of a failure detection
and diagnostic function.

V.  MANAGING THE LAN-DDN INTERFACE

As stated in Chapter 4, along with failure identification
and correction, the most important function of LAN
management is the monitoring and control of the local area
network to long haul network interface. This function is
primarily concerned with regulating the flow of packets
between the networks and any other tasks which support its
accomplishment. In this Chapter, our emphasis will be on
identifying and discussing the managerial aspects
associated with the interconnection of two networks.

A fundamental aspect of internetwork communication is the
establishment of agreed upon conventions. Communicating
entities must share some physical transmission medium and
they must use common conventions or agreed upon translation
methods [Ref. 29: p. 1392]. This required commonality can
be achieved in a number of ways. Protocols of one net can
be translated into those of another, or common protocols
could be defined. Another method through which commonality
may be achieved calls for conversion to a standard
interface by all networks. It is the researcher's opinion
that the connection of long haul networks to local area
networks does not lend itself to the establishment of
common protocols that would be efficient for both networks.
Additionally, the benefit to be derived from converting to
a standard interface is only realized if a network is
connected to more than one other network. If connected to
only one network, utilizing a standard interface would
require two protocol translations. Network A's protocol
would require translation into a standard interface
protocol, which would then require translation into network
B's protocol upon arriving at the connected network. The
use of protocol translation, by contrast, would only
require the conversion of A's protocol to B's, or vice
versa, depending upon the direction of packet flow.
Therefore, we will consider the issue of managing the
interface between two networks from the standpoint of
protocol conversion, rather than from common protocol or
standard interface establishment.

There are many differences which exist between networks
that must be resolved. Those that will be covered in detail
in this Chapter include: naming and addressing, flow and
congestion control, packet size, and access control.
Additional areas to be discussed will encompass gateway
configuration, internetwork accounting, and dispersal of
network status information.

A.   GATEWAY  CONFIGURATION 

In an effort to set the stage for a discussion dealing with
the management of an interface between two networks, it is
felt that an understanding of possible gateway
configurations, or levels of interconnection as dubbed by
Cerf [Ref. 29: p. 1392], will prove beneficial. There are a
variety of different ways in which the gateway between two
packet switched networks may be configured [Ref. 28: p.
4-49]. We will briefly describe each one and discuss why it
should, or why it should not, be considered for the SPLICE
network. Finally, for explanatory purposes, we will select
a configuration and use it as an example throughout the
Chapter.

Utilizing a common host is a simple and very
straightforward approach that can be used to connect two
networks. This method connects two networks through a host
that is attached to both of them. This configuration can be
ruled out immediately from consideration for the SPLICE
network, because the entire SPLICE program is based upon
relieving the host(s) of communication responsibilities. To
burden the host computer with anything but the processing
of application programs would be entirely against the
SPLICE concept.

Another approach to interconnecting packet switching
networks would be to have a switching node which is common
to both of them. This method must also be ruled out from
consideration. First of all, the LAN does not possess a
switching node. An attempt might be made to combine the
functions of a DDN switching node and the LAN front end
processor (FEP). Although a technically feasible solution,
the drawbacks are major and numerous.

An internode device can be used as a separate entity to
perform only gateway functions between each of the networks
to be interconnected. This gateway is normally designed to
appear as a special host to each network. This approach
provides the most acceptable alternative; however, it is the
author's opinion that the requirement for additional
hardware to perform the interconnection of two networks is
not supportive of the SPLICE concept.

The final possibility for a gateway configuration utilizes
the existing capabilities of a DDN switching node (IMP) and
the local area network FEP. This configuration is called
the "two half-gateway". In the "two half-gateway" approach,
a gateway is composed of two halves, each associated with
its own network. Each half-gateway would only be
responsible for translating between the internal packet
format of its own network and some common internetwork
format. The number and different types of networks the DDN
ties into will dictate whether or not an approach of this
nature is optimum. For the time being, no standard
internetwork format has been proposed. This being the case,
a slight modification to this approach should make it
usable and efficient for connecting a SPLICE local area
network to the DDN. This change would require a conversion
from the internal protocol(s) of the local area network to
the protocol(s) of the Defense Data Network and vice versa,
depending on the direction the packets are flowing.

For the remainder of this Chapter, we will utilize the "two
half-gateway" as our basis for explaining the differences
that must be overcome when connecting two networks, and the
functions which must be accomplished by the gateway. For
our discussion, it may help to picture one half of the
gateway implemented in the LAN FEP, and the other half of
the gateway resident in a DDN switching node which, in
conjunction with the LAN FEP, allows communication between
the two networks to be achieved. Finally, a number of
assumptions have been made which, it is felt, will clarify
the concepts discussed and provide a basis upon which
analysis can be conducted and proposals made.

1)  The LAN cannot affect the speed at which packets transit
the DDN.

2)  The LAN FEP cannot increase the rate at which packets are
sent to it from the switching node past the maximum
transmission rate of that node.

3)  The switching node that the LAN ties into may also act as
an IMP through which other hosts, not part of the LAN,
access the DDN.

4)  Error control, flow control, and duplicate packet
detection are provided for communication between the LAN FEP
and the DDN switching node by one of the network access
protocols supported by the DDN. In this situation, the
switching node merely views the front end processor as
another host.

B.   PACKET SIZING

The problem of differences in packet size is basically one
of coping with the fragmentation that must inevitably occur
when the two interconnected networks employ different
internal maximum packet sizes [Ref. 28: p. 4-49]. Two
situations may exist: one is when the maximum packet size
for the LAN is greater than that of the long haul network
(LHN), the other being when the maximum packet size for the
long haul network is greater than the maximum packet size
for the local area network.

The first case, when LAN maximum packet size is greater
than the long haul network maximum packet size, can be
handled in one of three ways. First, if the packet to be
transmitted from the LAN to the LHN is smaller than the LHN
maximum packet size by at least the number of additional
overhead bytes that will be added on by the packet
switching node once the packet reaches the DDN, then the
packet requires no size modification before being sent to
the switching node. Second, if the packet to be transmitted
is larger than the LHN maximum packet size, we may fragment
the packet appropriately in the FEP. Each packet would be
fragmented such that the new packets would be smaller than
the maximum packet size for the LHN even after the overhead
bytes were added by the LHN switching node. A number of
problems exist with this approach, which include a
requirement for increased software capabilities at the FEP,
additional delay experienced by packets wanting to leave
the network, and the possibility that resequencing of all
the packets making up the message being sent may be
required due to the insertion of a "new" packet into the
sequential series of packets that have been transmitted
from somewhere in the LAN. And finally, if the packet to be
transmitted is larger than the LHN maximum packet size, we
may just go ahead and send the packet to the switching
node. We are able to do this because the DOD Standard
Internet Protocol, which will be implemented by the Defense
Data Network, provides for a fragmentation/reassembly
service. It is envisioned that the "over-sized" packet
would be fragmented with each piece being sent to the
destination switching node, where the fragments would be
reassembled back into the "over-sized" packet.

In addressing the second case, where LHN maximum packet
size is greater than LAN maximum packet size, we assume
that the fragmentation of a smaller LAN packet to help fill
up a partially filled larger LHN packet will not occur. In
this situation, the main concern of the LAN is that it
might receive a packet from the DDN which is larger than
its maximum packet size. This being the case, the LAN FEP
must possess the capability to fragment the larger packet
into packets suitable for transmission on the local area
network.
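
Fragmentation in the FEP, as discussed in this section, can be sketched as below. The sketch assumes a fixed number of overhead bytes added downstream and treats a fragment as acceptable when payload plus overhead fits within the maximum packet size; sequence numbering and reassembly are left out, and all names and sizes are hypothetical.

```python
from typing import List


def fragment(payload: bytes, max_packet: int, overhead: int) -> List[bytes]:
    """Split payload so each piece plus overhead fits within max_packet."""
    room = max_packet - overhead
    if room <= 0:
        raise ValueError("overhead leaves no room for data")
    return [payload[i:i + room] for i in range(0, len(payload), room)]


# A 10-byte payload, a 6-byte maximum packet, and 2 bytes of overhead.
pieces = fragment(b"ABCDEFGHIJ", max_packet=6, overhead=2)
print(pieces)
```

The same routine applies in either direction: outbound when the LAN packet exceeds the LHN maximum, and inbound when a packet from the DDN exceeds the LAN maximum.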

C.   CONGESTION  CONTROL 

Assuming probabilistic message generation and fixed
capacity in network components, overload would be
inevitable without certain mechanisms to stop, slow down or
absorb the rate of message arrival. The basic tool utilized
in the accomplishment of these tasks is congestion control.
Congestion control can be defined as a procedure whereby
distributed network resources, such as channel bandwidth,
buffer capacity, and CPU capacity, are protected from
oversubscription by all sources of network traffic
[Ref. 29: p. 1430]. Congestion is most likely to be visible
at a gateway connecting a local area network to a long haul
network. In some cases, the transmission rates of LANs
might exceed those of long haul networks by factors of
30-100 or more [Ref. 29: p. 1400]. There are basically two
schools of thought when it comes to dealing with the
problem of congestion control. There are those who advocate
rigidly controlling the input of packets into a network and
explicitly rule out the discarding of packets as a means of
congestion control. Conversely, there are some who promote
the dropping of packets as the sole means of controlling
congestion [Ref. 29: p. 1400]. We will look at congestion
and flow control at the interface between two networks from
both of these viewpoints. It is the author's intention to
propose and discuss techniques of congestion and flow
control for receipt of packets from the LAN and for receipt
of packets from the DDN by the half of the gateway resident
in the LAN FEP.

1.   LAN to LHN Packet Control

The author has concluded that there exist numerous methods
of congestion control, many of which have yet to be
identified. The discussion which follows includes the
presentation of three possible methods of gateway
congestion and flow control. These methods deal with the
handling of packets received from the LAN by the front end
processor and destined for the DDN.

The simplest method of congestion control provides
for the immediate transmission of packets to the DDN. If
the gateway portion of the FEP, in conjunction with its
associated switching node, is able to successfully transmit
packets to the DDN faster than they arrive from the LAN,
then we can assume the requirement for congestion and flow
control is minimal in that direction. However, the author
has concluded that this is rarely the case. This approach
would inevitably lead to the loss of packets due to the
gateway's inability to transmit them at a rate comparable to
that of the LAN. Recovery/retransmission of those 'lost'
packets to the gateway would be left to the lower level
protocols.

Another method through which congestion control at
the FEP could be accomplished would be through the addition
of buffers. Packets flowing in from the LAN could be queued
in a buffer for subsequent transmission to the long haul
network. Once this buffer becomes full, packets could be
discarded as in the first method, or a signal of some type
could be sent throughout the network indicating that the DDN
output buffer was full. Receipt of this message would also
imply that no internetwork traffic should be sent until a
message is received from the gateway indicating that the
buffer is empty and internetwork traffic transmission can be
resumed.
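
The single-buffer scheme just described can be sketched as
follows. This is only an illustration under stated
assumptions: the class name, the callback, and the message
strings "DDN_BUFFER_FULL" and "DDN_BUFFER_EMPTY" are
hypothetical and are not taken from any DDN or SPLICE
specification.

```python
from collections import deque


class GatewayOutputBuffer:
    """Bounded FIFO queue for packets awaiting transmission to the long haul network."""

    def __init__(self, capacity, notify=None):
        self.capacity = capacity
        self.queue = deque()
        self.notify = notify      # hypothetical callback taking a message string; None means discard silently
        self.accepting = True     # False between a "full" and an "empty" notification
        self.discarded = 0

    def offer(self, packet):
        """Called for each packet arriving from the LAN; returns True if it was queued."""
        if len(self.queue) >= self.capacity:
            self.discarded += 1   # buffer full: the packet is dropped, as in the first method
            return False
        self.queue.append(packet)
        if len(self.queue) == self.capacity and self.notify and self.accepting:
            self.accepting = False
            self.notify("DDN_BUFFER_FULL")
        return True

    def transmit_one(self):
        """Called when the long haul link can take a packet; returns it, or None if idle."""
        if not self.queue:
            return None
        packet = self.queue.popleft()
        if not self.queue and not self.accepting:
            self.accepting = True
            if self.notify:
                self.notify("DDN_BUFFER_EMPTY")
        return packet
```

Passing notify=None gives the discard-only behavior; passing
a callback gives the signalling variant, in which LAN
components are expected to hold internetwork traffic until
the "empty" message arrives.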

This technique could also be employed with two
buffers. Once one buffer was full, it would be disabled
from receiving additional packets while transmission took
place. Simultaneously, the second buffer could be filled
and its contents transmitted when the first buffer became
empty. While the second buffer's contents were being
transmitted to the DDN, the first buffer would be receiving
packets from the LAN. This alternating technique could be
employed with N buffers, but this would be at the expense
of losing N buffers' worth of memory space in the FEP. This
being the case, a limit to the buffer space allocated to
internetwork traffic would have to be established. With
this limited buffer space, there still exists the
possibility that all buffers may become full simultaneously.
This would require either that incoming packets be discarded
or that the network be notified that the buffers are full
and internetwork packet transmission is disabled.
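
A minimal sketch of the alternating scheme, generalized to N
buffers, is given below. The class and method names are
hypothetical. It assumes a buffer is closed to the LAN as
soon as it fills and is reopened only after its whole
contents have been sent to the long haul network.

```python
class AlternatingBuffers:
    """N fixed-size buffers that are filled and drained in round-robin order."""

    def __init__(self, n_buffers, size):
        self.size = size
        self.buffers = [[] for _ in range(n_buffers)]
        self.sealed = [False] * n_buffers   # True once full: closed to the LAN until drained
        self.fill_idx = 0                   # buffer currently receiving from the LAN
        self.drain_idx = 0                  # next buffer to be transmitted to the LHN

    def receive(self, packet):
        """Store one packet from the LAN; returns False when every buffer is full."""
        if self.sealed[self.fill_idx]:
            return False                    # caller must discard the packet or notify the network
        current = self.buffers[self.fill_idx]
        current.append(packet)
        if len(current) == self.size:
            self.sealed[self.fill_idx] = True
            self.fill_idx = (self.fill_idx + 1) % len(self.buffers)
        return True

    def transmit_buffer(self):
        """Return the contents of the oldest full buffer and reopen it; [] if none is full."""
        if not self.sealed[self.drain_idx]:
            return []
        sent = self.buffers[self.drain_idx]
        self.buffers[self.drain_idx] = []
        self.sealed[self.drain_idx] = False
        self.drain_idx = (self.drain_idx + 1) % len(self.buffers)
        return sent
```

The memory cost noted above is visible directly: the FEP
must reserve n_buffers times size packet slots, and receive
still returns False once all of them are full.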

A final method by which the flow of traffic from the
LAN to the LHN can be controlled is through the use of
external storage areas. This technique is very similar to
the buffering methods presented above. Buffers are utilized
in the same fashion but, when they become full, rather than
discarding packets or notifying the network of the buffers'
state, all incoming packets are directed to external
storage areas. When the buffers begin to empty, packets
currently being stored are directed to the output buffers on
a FIFO basis. This procedure reduces congestion on the LAN
by not requiring the continual retransmission of packets not
previously accepted by the gateway. Additionally, it
eliminates the need for distributed components to be able to
recognize a "DDN buffers full" message and carry out the
internetwork packet restricting action necessitated by its
receipt.
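
The external storage method can be sketched in the same
style. The names are hypothetical, and an in-memory queue
stands in for the external storage area (a disk file, for
example). One detail the sketch makes explicit is an
assumption of ours: once any packet is held externally,
later arrivals must queue behind it, otherwise FIFO order
would be lost.

```python
from collections import deque


class SpillingBuffer:
    """Output buffer that overflows to external storage instead of discarding packets."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.output = deque()     # in-memory output buffer feeding the long haul network
        self.external = deque()   # stands in for the external storage area

    def receive(self, packet):
        """Accept every packet from the LAN; nothing is discarded or refused."""
        # If packets are already stored externally, new arrivals join that queue
        # so that the overall ordering stays first-in, first-out.
        if self.external or len(self.output) >= self.capacity:
            self.external.append(packet)
        else:
            self.output.append(packet)

    def transmit_one(self):
        """Send one packet to the LHN and refill the freed slot from external storage."""
        if not self.output:
            return None
        packet = self.output.popleft()
        if self.external:
            self.output.append(self.external.popleft())
        return packet
```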

2.   LHN to LAN Packet Control

As previously stated, we are assuming that the flow
of packets between the FEP half of the gateway and the
switching node half of the gateway is controlled by the
network access protocols supported by the DDN. This being
the case, our discussion is restricted to answering
questions such as: 'Should we transmit each packet
immediately onto the LAN upon its receipt from the DDN?',
or 'Should we employ some buffering techniques,
accumulating some packets before transmitting them onto
the LAN?'.

It is the author's opinion that the trickling of
packets onto the network one at a time does not efficiently
utilize the capabilities of a 10 Mbit/sec LAN. This method
reduces network throughput, and requires adaptors and
components to "wait" longer between internetwork packet
arrivals. By storing the internetwork packets in buffers or
dedicated external storage areas, we are able to transmit
packets onto the LAN in bursts. These transmissions can
occur after an entire message is received or after a certain
number of packets have accumulated in the buffers or
external storage areas.
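
The two release conditions named above (end of message, or a
packet count reached) can be sketched as follows. The class
name, the burst_size parameter, and the end_of_message flag
are hypothetical; how the gateway actually learns that a
message is complete depends on the DDN access protocol and
is not modeled here.

```python
class BurstReleaser:
    """Accumulates packets from the DDN and releases them onto the LAN in bursts."""

    def __init__(self, burst_size):
        self.burst_size = burst_size
        self.pending = []

    def receive(self, packet, end_of_message=False):
        """Store one packet; return the burst to transmit now, or [] to keep waiting."""
        self.pending.append(packet)
        if end_of_message or len(self.pending) >= self.burst_size:
            burst, self.pending = self.pending, []
            return burst
        return []
```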

D.   ADDRESSING AND NAMING

Whenever any two devices must communicate with each
other and they are not directly connected (e.g. a processor
on one network communicating with a processor on another
network), the question of addressing the proper recipient
becomes a major consideration. Addressing across network
boundaries requires either a standard network numbering
scheme or a means of address translation in the gateway
[Ref. 28: p. 4-49]. It is known that the DDN will connect
with existing networks as well as the SPLICE local area
networks. It is the author's opinion that this in itself is
sufficient to justify the establishment of a standard
numbering scheme. This will therefore be the premise upon
which our discussion will be based.

Many different internetwork addressing schemes are
possible. The CCITT X.121 addressing strategy is based on
the telephone network system. This technique allows up to 14
digits per address. The first 4 digits are a destination
network identifier code (DNIC), followed by the remaining
digits, which may be used to implement a hierarchical
addressing structure [Ref. 29: p. 1403]. The DARPA Internet
has implemented a common address format across all networks
it connects [Ref. 30: p. 114]. The Internet address length
is fixed at 32 bits. These bits contain the address of a
particular network, and the address of a host within that
network. A further disaggregation of this concept might
call for an address field which contained a network address,
the address of a packet switching/gateway node within that
network, and the address of a host accessible through that
node. We will utilize the addressing technique implemented
in the Internet as the basis for the remainder of our
discussion.
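
To make the network/host division of a 32-bit Internet
address concrete, the following sketch splits an address
into its two parts. The text above does not say where the
boundary falls; the sketch assumes the class A, B, and C
rules of the Internet Protocol specification (RFC 791), in
which the leading bits of the address select an 8-, 16-, or
24-bit network portion. The function names are our own.

```python
def parse_dotted(text):
    """Convert dotted-decimal notation such as '10.0.0.5' to a 32-bit integer."""
    octets = [int(part) for part in text.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError("expected four octets in the range 0-255")
    value = 0
    for o in octets:
        value = (value << 8) | o
    return value


def split_internet_address(addr):
    """Split a 32-bit address into (network, host) using the class A/B/C rules."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    first_octet = addr >> 24
    if first_octet < 128:      # class A: network number is the first octet
        host_bits = 24
    elif first_octet < 192:    # class B: network number is the first two octets
        host_bits = 16
    elif first_octet < 224:    # class C: network number is the first three octets
        host_bits = 8
    else:
        raise ValueError("not a class A, B, or C address")
    return addr >> host_bits, addr & ((1 << host_bits) - 1)
```

For example, split_internet_address(parse_dotted("10.0.0.5"))
yields network 10 and host 5. The three-level variant
mentioned above (network, switching node, host) would simply
divide the host portion once more.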

In order to manage, control, and support communications
among components distributed throughout two or more
networks, a means must exist for explicitly identifying the
components involved in the communication. This could be
accomplished by utilizing one of the addressing strategies
presented above. In implementing this strategy, rather than
requiring the user to be aware of the structure of the
network in which the destination host resides, a naming
convention could be established which relieves him of
indicating the actual address of the desired host. A naming
convention can also be established for identifying the
network to be accessed rather than requiring a specific
address to be provided.

Assuming an operator may now use names to identify both
the destination network and the host within that network,
the task of converting these to actual network addresses
must be considered. Translation of the network name to a
specific network address will be accomplished by the
switching node through which a SPLICE LAN is connected to
the Defense Data Network. Currently, nodes attached to the
DDN may be known by as many as four different names
[Ref. 31: p. 111]. The translation of a local host name to
its associated address, and vice versa, could be
accomplished by the switching node. The author does not
support this approach, for the major reason that the
switching node will most probably be connecting other
networks and/or hosts to the DDN. Performing these
translations as well would reduce the node's capability to
perform its primary functions of traffic processing, host
access, routing, and monitoring and control
[Ref. 31: p. 33]. Therefore, local host name translation
must be performed at the local level.

This local translation capability could be accomplished
at the interface between a distributed component and the
bus. This would require the use of additional component
resources for the performance of a function which could most
efficiently be implemented at a centralized location (i.e.
the Front End Processor), rather than at each individual
component. By incorporating the local translation
capability into the LAN's FEP, we not only reduce redundancy
throughout the system, but also facilitate the maintenance
of our translation tables. The final issue to be addressed
is the place (source or destination network) at which this
translation occurs. Translation of the destination's name
can occur either at the source's gateway or at the
destination network. By delaying translation of the
name to an address until arrival at the destination, we
eliminate the requirement for each gateway to possess
specific address information about other networks.
Similarly, the translation of a process name to a process
address would also be accomplished by the destination
network's FEP. The half-gateway resident in each SPLICE FEP
would only be required to maintain a table containing the
names and addresses of its local components and processes.
Upon receiving a packet from the DDN, the component and
process string names would be compared against entries in
the address translation tables. Appropriate addresses would
replace the physical node name and process name. The packet
would then be ready for transmission onto the local bus.
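
The destination-side lookup described above might be
sketched as follows. The packet is modeled as a dictionary
and the field names (component_name, process_name,
component_addr, process_addr) are hypothetical; the actual
SPLICE packet layout is not given here. Keying process
entries by both component and process name is likewise an
assumption.

```python
def translate_inbound(packet, components, processes):
    """Replace string names in a packet from the DDN with local addresses.

    components maps a component name to its local address.
    processes maps (component name, process name) to a process address.
    Returns a new packet, or None if either name is unknown locally.
    """
    comp_name = packet["component_name"]
    proc_name = packet["process_name"]
    comp_addr = components.get(comp_name)
    proc_addr = processes.get((comp_name, proc_name))
    if comp_addr is None or proc_addr is None:
        return None   # no table entry: the half-gateway cannot deliver this packet
    translated = {
        key: value
        for key, value in packet.items()
        if key not in ("component_name", "process_name")
    }
    translated["component_addr"] = comp_addr
    translated["process_addr"] = proc_addr
    return translated
```

Because only the destination FEP holds these two tables, a
change to a local address requires updating one table in one
place, which is the maintenance advantage argued for above.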

E.   ACCESS  CONTROL 

Access control is concerned with establishing mechanisms
that may be required to prevent some traffic from entering
and possibly some traffic from leaving the network. This
filtering action is ideally accomplished by the gateway
between two networks. Utilizing our model of a "two
half-gateway", each half can deal with controlling access to
the network that it is connected to. What this means is that
our half of the gateway in the LAN FEP can act as a sentry
to incoming traffic. As traffic arrives, the "ID" of the
packet(s) can be checked against a table containing the
"names" of those packets which are authorized to enter the
LAN. If a packet's "ID" appears on the access list, entry is
granted. If not, the sentry may either discard these packets
or possibly send them to an access controller
[Ref. 29: p. 1401]. The access controller routine can then
dynamically enable the flow of the packets into the network
after performing certain checks on the packets' identity,
or it may decide that these packets are not to be allowed
into the network, discard them, and send a suitable 'canned'
response to the source of the packet(s) letting it know
access was not granted. Alternatively, it may inform network
operations personnel of the packets that wish to enter the
network and request action to be taken.
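
The sentry and access controller roles can be sketched as
two small functions. The outcome labels and the two lists
consulted by the example controller (dynamic_grants and
watch_list) are hypothetical; the text above leaves the
"certain checks" unspecified.

```python
ADMIT = "admit"        # let the packet onto the LAN
DISCARD = "discard"    # drop silently
REFUSE = "refuse"      # drop and send a canned "access not granted" response
ESCALATE = "escalate"  # inform network operations personnel and await action


def sentry(packet_id, access_list, controller=None):
    """First-line check performed by the half-gateway in the LAN FEP."""
    if packet_id in access_list:
        return ADMIT
    if controller is None:
        return DISCARD             # no access controller configured
    return controller(packet_id)   # hand the decision to the access controller


def make_controller(dynamic_grants, watch_list):
    """Build an example access controller from two hypothetical ID sets."""
    def controller(packet_id):
        if packet_id in dynamic_grants:
            return ADMIT           # checks passed: dynamically enable the flow
        if packet_id in watch_list:
            return ESCALATE        # let an operator decide
        return REFUSE
    return controller
```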

F.   OTHER  CONSIDERATIONS 

Two additional areas of concern associated with the
interconnection of two networks are failure notification and
accounting procedures. Assuming that failures for the
connected networks are detected, identified, and isolated
internally by each network, the question arises, 'How is the
existence of a failed component within a network
communicated to those in other networks who may wish to use
that component?'. Assuming that the LAN configuration data
base and problem management data base have both been updated
with the current status of any particular failure, the
researcher makes the following proposals in response to the
previous question. Before packets are let into the local
area network, the half-gateway will be responsible for
checking these data bases to ensure that the desired
destination is operational. If it is, and assuming the
access controller has permitted access, the packets are
transmitted into the network. If the desired destination is
currently inoperative, a response indicating such is
returned to the source. Additionally, if a source from
another network desires to check the status of an element
within a SPLICE LAN, it should have the capability, just as
a local user would, of querying either one of these data
bases. Also, it is assumed that if the switching node
through which a SPLICE LAN is connected to the DDN fails,
then the responsibility of reporting the inaccessibility of
that particular local area network lies with the DDN
Monitoring Center whose jurisdiction includes the failed
node. Similarly, the failure of a LAN FEP which makes a
SPLICE LAN configuration inaccessible will be reported to
potential network users by the connecting switching node.
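
The admission check proposed above can be summarized in a
few lines. The function name, the packet field, and the
shape of the status table are hypothetical; the status table
stands in for the configuration and problem management data
bases. Treating a destination with no status entry as
inoperative is an assumption of this sketch.

```python
def admit_to_lan(packet, status_db, access_permitted):
    """Decide what the half-gateway does with one inbound packet.

    status_db maps a destination name to True (operational) or False.
    Returns ("forward", packet) or ("reply", reason).
    """
    dest = packet["destination"]
    # Assumption: a destination missing from the data base counts as inoperative.
    if not status_db.get(dest, False):
        return ("reply", f"destination {dest} is currently inoperative")
    if not access_permitted:
        return ("reply", "access not granted")
    return ("forward", packet)
```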

It is the researcher's opinion that a SPLICE LAN is seen
as just another subscriber to the DDN. This being the case,
there seems to be sufficient justification for the
establishment of some type of accounting procedures which
provide the means through which the flow of packets to and
from the DDN can be monitored. Assuming some type of
accounting will be conducted by the switching node, the
connected LAN could obtain accounting information from it.
This does not provide for any type of cross checking of the
switching node's accounting capability or accuracy. This
then establishes the requirement for some sort of accounting
procedures to be established in the LAN's half of the
gateway. Currently, public packet switching networks are
using procedures which account for subscriber use on the
basis of the number of virtual circuits established during
the accounting period and the number of packets sent on each
virtual circuit [Ref. 29: p. 1400]. Only slight
modifications to certain reports recommended in Chapter 3
would be required to give a SPLICE local area network a
similar accounting capability. Finally, and most important
of all, the accounting mechanisms implemented by the SPLICE
LANs should be based upon procedures and units of measure
identical, or very similar, to those utilized by the DDN.
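
A minimal sketch of such an accounting record, kept in the
LAN's half of the gateway and counted in the same units as
the public networks cited above (virtual circuits and
packets per circuit), is shown below. All names are
hypothetical, and the comparison function only illustrates
the cross check against figures reported by the switching
node.

```python
from collections import defaultdict


class AccountingLog:
    """Counts virtual circuits and packets per subscriber for one accounting period."""

    def __init__(self):
        # subscriber -> circuit id -> number of packets sent on that circuit
        self.packets = defaultdict(lambda: defaultdict(int))

    def open_circuit(self, subscriber, circuit_id):
        self.packets[subscriber][circuit_id] += 0   # creates the entry with a zero count

    def record_packet(self, subscriber, circuit_id, count=1):
        self.packets[subscriber][circuit_id] += count

    def summary(self, subscriber):
        circuits = self.packets.get(subscriber, {})
        return {
            "virtual_circuits": len(circuits),
            "packets": sum(circuits.values()),
        }


def discrepancies(local, reported):
    """Return the fields where the switching node's figures differ from the local ones."""
    return {
        key: (local[key], reported.get(key))
        for key in local
        if local[key] != reported.get(key)
    }
```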

G.   CHAPTER SUMMARY

We began this Chapter with a discussion of various
configurations that a gateway between computer networks
could assume. The author feels that the "two half-gateway"
concept offers the simplest and most effective means of
interconnection. The discussion then turned to the problems
associated with different maximum packet sizes utilized by
the two interconnected networks. We looked at the
situations in which the LAN maximum packet size was greater
than the long haul network maximum packet size and vice
versa. In both cases, suggestions were made as to how this
problem could be handled. Flow control and congestion
control techniques were then discussed. This problem was
approached from two directions: first, controlling the flow
of packets into the LAN half-gateway for transmission to
the DDN; and second, controlling the flow of packets from
the Defense Data Network, through the gateway, into the
LAN. The problems of internetwork addressing and component
naming were then considered. The author has concluded that
the solution to the first problem would be the
establishment of a standard internetwork numbering and
addressing scheme. The standard offered was of the form
NETWORK ADDRESS/LOCAL HOST ADDRESS. The component naming
problem was found to be best handled at two levels. The
translation of a network name to a specific network address
would be conducted at the switching node half-gateway,
while the translation of a local host name to a local host
address would be accomplished by the destination network
half-gateway. We then briefly discussed the topic of access
control. There, we looked at the role played by an access
controller, and attempted to add support for its
implementation in the SPLICE network. Finally, we looked at
the need for failure notification and accounting
capabilities associated with internetwork traffic.

Exclusive of the interface between a SPLICE LAN and the
DDN, the monitoring and management of a SPLICE local area
network is predominantly centralized. Special interface
functions such as those described in this Chapter require
that the control of these functions be distributed to the
FEP. Finally, it is the researcher's opinion that the
management of the LAN/DDN interface must not only be
workable, but must be acceptable from an operational
standpoint by the users, and from a technical and logical
standpoint by network operators.

VI.  LAN CENTRAL MONITORING SITE

The integration of those management tools discussed in
Chapters 1-5 is accomplished by the local area network
central monitoring site (CMS). It is here that measurements
and statistics are collected, performance analysis is
conducted, diagnostic programs and recovery actions are
initiated, network utility data bases are updated, and
performance parameter adjustment messages originate. This
process of managing from a central location minimizes
communications and synchronization difficulties, and helps
solve problems that may otherwise pass unnoticed
[Ref. 33: p. 21].

The author will initially present what he feels to be
the mission of a central monitoring site, followed by
appropriately supportive objectives. The manning
requirements and organizational structure associated with a
CMS will then be discussed. From there, a network operator's
workbench will be described. Finally, a network operator's
responsibilities under both normal and failure conditions
will be presented.

A.   MISSION OF A LAN MONITORING SITE

The mission of a LAN central monitoring site might be
stated as, 'To ensure the most efficient and effective use
of network resources and to maximize network availability,
throughput and responsiveness'. Objectives which support
the accomplishment of this mission are:

-  Keeping track of the status and configuration of the
network.

-  Detecting alarm conditions and failed components.

-  Carrying out fault isolation and diagnostic tests.

-  Contacting appropriate repair personnel and monitoring
repair activities.

-  Altering the physical and logical network configurations
and documenting such alterations.

-  Adjusting component performance parameters.

-  Generating management reports.

-  Supporting test and acceptance activities.

-  Providing information needed for planning future network
evolution.

-  Providing a historical data base against which current and
future network performance may be measured.

-  Monitoring component utilization throughout the network
(e.g. host, communication processor, and shared resource
utilization).

-  Performing a scheduling function for application programs
requiring use of the host processors.

The first eight objectives are similar to those contained
in the Program Plan for the Defense Data Network
[Ref. 31: p. 142]. The 9th and 10th items are objectives of
the Lawrence Livermore Laboratory Octopus Network Monitoring
and Measuring Project [Ref. 34: p. 2]. The final two
objectives are a product of the author's research.

An analysis of these objectives shows that the tools
discussed in this paper are capable of accomplishing the 1st
through the 7th and the 11th objectives directly. Although
not mentioned as yet, the central monitoring site must be
able to support the testing, evaluation and acceptance of
new components that are to become part of the local area
network. By establishing the data bases described in
Chapter 1, we are indirectly supporting the accomplishment
of the 9th and 10th objectives. The establishment of data
bases which record network performance measures and
component utilization statistics would greatly enhance our
ability to meet these objectives. At this time, the
researcher does not see a need for the design of a
scheduling algorithm as proposed by the final objective. As
applications grow and the number of network users increases,
the requirement for establishing a scheduling algorithm for
the purpose of efficiently and fairly assigning jobs to host
processors may become very real. An example of where a
scheduling algorithm has been implemented is the Los
Alamos Scientific Laboratory Integrated Computer Network
[Ref. 35].

B.   MANNING AND ORGANIZATION OF A LAN CMS

How many people are required to ensure the continued and
efficient operation of a SPLICE local area computer network?
Should the monitoring site be manned around the clock? What
organizational aspects must be considered with the addition
of a central monitoring site? These are the questions that
will be addressed in this section. During the discussion of
each, a possible answer will be recommended by the author.

The manning proposed for the DDN monitoring centers
ranges from four people at the system monitoring center to
one or zero at other centers [Ref. 31: p. 137]. The manning
of the NBSNET (a bus oriented local computer network)
measurement center calls for '4 full-time and several
part-time computer-electrical engineers' [Ref. 35: p. 13].
Neither of these manning levels seems appropriate for a
SPLICE LAN, the DDN being a much larger long haul network,
and the NBSNET manning level reflecting more of an
experimental environment. A local area network with a
structure very similar to that of a SPLICE LAN is the Hughes
Aircraft Company JANET network [Ref. 37]. JANET has
centralized the control and monitoring of the network at a
single operator position. From this position the operator
can issue commands to all network components, perform
testing and performance analysis, detect and diagnose
failures, and reconfigure the network. The degree of
automation recommended throughout this paper would provide
the capabilities required for a 'one person' central
monitoring site. If these and other lower level functions
are not automated, the possibility exists that an
additional operator may be required. The author feels that
substantial processing will occur, and that file transfers
between Stock Points and Inventory Control Points will take
place, after normal working hours. For this reason, it is
felt that a network operator should be available at any time
processing is in progress. After normal working hours, this
position may be filled as a collateral duty (i.e. the
individual filling this role may also be responsible for one
of the host processors).

In answering the question as to where the monitoring
site fits into the organizational picture, one must remember
that the CMS can exercise a great deal of control over the
network and its components. This being the case, those
individuals comprising the CMS must work directly for the
'Director of the LAN'. We would not want the central
monitoring site to come under the control of one of its
users or under the control of one of the staffs associated
with a network host. This seems to add more justification
for establishing the position of 'Director of the LAN'. This
Director could operate out of the central monitoring site.
From here, he could manage and control the operations and
resources of the local network, formulate network policy,
and see to its implementation.

C.   A NETWORK OPERATOR'S WORKBENCH

A network operator's workbench is a single, integrated
system containing all the operator's tools in one place.
The system must be interactive and, because new analysis
packages and models will be continuously developed, possess
the characteristics of a programmer's workbench [Ref. 21: p.

Certain hardware assets will enable the operator to
better carry out the network management function. The
terminal or terminals utilized by the CMS should have a
fairly extensive graphics capability. For example, a
display of the entire network could be put on the screen
with different colors indicating the status of various
components. A dedicated printer will be needed for
managerial reports but, more importantly, for the recording
of failure messages received by the CMS. Adequate direct
access storage will also be a necessity. Additionally, an
alarm capability for indicating the breaching of established
parameter thresholds will be required.
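
The alarm capability mentioned above amounts to comparing
each incoming measurement against operator-set limits. A
minimal sketch follows; the function name and the (low,
high) representation of a threshold are hypothetical, with
None meaning that no limit is set on that side.

```python
def check_thresholds(measurements, thresholds):
    """Return the sorted names of measurements that breach their thresholds.

    measurements maps a parameter name to its current value.
    thresholds maps a parameter name to a (low, high) pair; either may be None.
    Parameters without an entry in thresholds are not checked.
    """
    alarms = []
    for name, value in measurements.items():
        limits = thresholds.get(name)
        if limits is None:
            continue
        low, high = limits
        too_low = low is not None and value < low
        too_high = high is not None and value > high
        if too_low or too_high:
            alarms.append(name)
    return sorted(alarms)
```

For instance, a throughput floor and a response time ceiling
could both be expressed in the same table, and the returned
list would drive the audible or visual alarm at the CMS.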

There exist numerous software tools that can be
utilized by the network operator. One of the most important
is a good DBMS with a complete, user-friendly query
language. This asset will allow the operator to investigate
relationships between performance measurements and
associated parameters, and to ask exploratory questions
concerning the effect of certain network configurations on
performance criteria. The possession of a word processing
capability will also assist the operator in the performance
of his duties. An additional software asset is the actual
process through which the operator interacts with these
tools. This interface may be through a Network Operating
System as currently planned for the DDN [Ref. 11] and
described in [Ref. 32]. Another approach to this problem is
to have the operator interact with the software tools
through an application program. The Hughes Aircraft Company
has implemented a Network Monitor Program for its JANET
network which runs as an application program on one of the
host processors [Ref. 37: p. 96]. The distributed
counterpart of this central control program is a
'background' program included as part of the adaptor
microcode. It is through these background programs that the
CMS receives certain measurements and failure messages.
Additional software tools that will be of help to the
operator include: an English language set of commands for
ease of system operation and network diagnostics, default
parameter value establishment if unspecified by the
operator, dynamic control programs for adjusting lower level
performance parameters in accordance with network
conditions, and finally a system for the prompt and accurate
collection of any data the user may provide on a problem.

D.   OPERATORS  ACTIONS:  NORMAL  CONDITIONS 

In the next two sections we will attempt to identify the
responsibilities of a network operator under both normal and
abnormal conditions. They are presented here in an effort
to establish a basic set of responsibilities for all SPLICE
LAN operators. This section deals with the operator's
responsibilities under normal conditions. It is realized by
the author that some of these responsibilities may also
pertain to failure conditions. Finally, it is not known
which, if any, of the identified responsibilities will be
automated. Therefore, each responsibility will be presented
as if the operator had to take some specific action for its
accomplishment.

1.   Initialization

Assuming the network operator has just invoked the
network management control program, there are certain
functions that must be accomplished. The operator must
establish a connection with the 'background' program in the
adaptor or component interface. Among other things, this
will enable him to find out just who is on-line. Once
connections are established, the operator can send out
instructions to the nodes providing them with guidance as to
what measurements to take, when to send them to the CMS, and
upcoming maintenance activities. Also during this time, the
network operator obtains the physical and logical
configuration tables for each host and communication
processor, which are then stored in a global network
configuration table. During initialization the network
operator also sets performance parameter values, establishes
alarm thresholds for performance measurements, identifies
critical components which he is specifically interested in
monitoring, and updates the Name/Address Table in the FEP
half-gateway.

2.   Utility  Data  Bases 

Information obtained from each component about
itself and its associated peripherals is used to update the
network configuration data base. This provides the operator
with a view of the physical state of the network.
Additionally, logical configuration tables can be
established for each component and user, which gives them
their own 'customized' configuration of the network. Also
during this time, problem management, change management and
performance analysis data bases may be opened for read/write
and checked for items of interest. Finally, the operator
needs to communicate with the DDN Monitoring Center whose
area of influence the LAN falls within. This interaction
may simply be the transfer of the current DDN status file to
the CMS. This file can then be used to assist users
attempting internetwork communication.

3.  Operator's Displays

The network operator is responsible for monitoring
various network status displays and in some cases ensuring
their availability to users. These status displays are
created from data obtained from configuration and problem
management data bases in addition to results of performance
analysis and component monitoring. As a minimum, the status
displays that should exist include: a global network status
display, displays for each major component with appropriate
operating information, displays for any desired network
performance parameters such as throughput and response time,
a general information display for informing users of
scheduled maintenance, DDN status and administrative
activities, and a display for depicting load information on
hosts and communication processors.

4.  Normal Management Activities

In addition to those activities mentioned above, the
network operator is also responsible for the accomplishment
of other normal management activities. He must initiate
monitoring periods for the collection of measurement data.
Upon the completion of the monitoring period he must:
control the transfer of data from the adaptors to the CMS,
disable adaptors from taking additional measurements, and
clear adaptor memory contents if so required.
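
The end-of-period steps listed above can be sketched as follows. The Adaptor class and the cms_store dictionary are hypothetical stand-ins introduced only for illustration; the thesis does not specify an adaptor interface.

```python
class Adaptor:
    """Hypothetical stand-in for a network adaptor's measurement buffer."""

    def __init__(self, name):
        self.name = name
        self.measuring = False
        self.memory = []

    def record(self, sample):
        # Samples are kept only while a monitoring period is active.
        if self.measuring:
            self.memory.append(sample)


def start_period(adaptors):
    """Enable measurement collection on every adaptor."""
    for adaptor in adaptors:
        adaptor.measuring = True


def end_period(adaptors, cms_store, clear_memory=True):
    """Close a monitoring period in the order given in the text."""
    for adaptor in adaptors:
        # 1. Transfer the collected data from the adaptor to the CMS.
        cms_store.setdefault(adaptor.name, []).extend(adaptor.memory)
        # 2. Stop the adaptor from taking additional measurements.
        adaptor.measuring = False
        # 3. Clear adaptor memory only if so required.
        if clear_memory:
            adaptor.memory.clear()
```

In a real network, measurements could arrive between the transfer and the disable step, so an implementation might prefer to disable the adaptor first. The sketch keeps the order in which the text lists the steps.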

Utilizing data gathered and statistics generated
during the monitoring period, the network operator must
ensure the appropriate data bases are updated. This may
include modifications to the configuration, problem
management, and performance analysis data bases. Information
obtained during the monitoring period is also to be used by
the network operator to identify trends, look for
bottlenecks in the network (especially at the DDN/LAN
interface), conduct network performance analysis, and
prepare network status and utilization reports. While
analyzing results of the monitoring period, the operator may
become aware of some pending component failure. If so,
appropriate action is taken to diagnose and correct the
failure. More specific action to be taken upon failure
detection will be discussed in the next section.

Other normal management activities include the
operator's responsibility to test all adaptors' failure
detection and diagnostic capabilities, the distribution of
new software versions, adjustment of network logical and
physical configurations, and adjusting performance
parameters in order to tune the network. The network
operator is also responsible for informing users of planned
maintenance activities and coordinating those activities
with them. One final responsibility calls for CMS personnel
to be involved in the installation, testing, and acceptance
of equipment that is going to become part of the network.

E.   OPERATOR'S ACTIONS: COMPONENT FAILURE

Having utilized the performance analysis and problem
detection and diagnosis techniques presented earlier in this
paper, let us assume the network operator has identified a
failed component. What then are the procedures that must be
followed in order to manage this failure until its
rectification? Assuming the failure is of major
significance, such as a down communication processor, one of
the first things the operator should do is notify the
network of the failed component. Concurrently,
configuration tables, the problem management data base, and
the Name/Address Table in the FEP should be updated.
Appropriate entries for the problem management and
configuration data bases are shown in Appendix A and
Appendix B respectively. Having done this, the operator may
utilize some form of DDT as discussed in Chapter 4, in an
attempt to correct or further isolate the cause of failure.
If this fails, the operator can utilize the information that
has been recorded in the network history file to try to
'back up' the processor to a point before the failure
occurred and attempt a restart. The last chance the
operator has to correct the problem is to dump the software
suspected of causing the failure to off-line storage, and
reload the system with a fresh copy of the appropriate
software. Having exhausted his means of problem correction,
the operator is responsible for contacting the appropriate
vendor.
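
The corrective sequence just described is an ordered escalation: each step is tried only if the previous one failed, and the vendor is contacted last. A minimal Python sketch follows. The function name recover and the representation of each step as a callable returning True on success are assumptions made for illustration.

```python
def recover(component, steps):
    """Try each corrective step in order; stop at the first that succeeds.

    steps -- ordered list of (label, action) pairs, where action(component)
             returns True if the component is working again (hypothetical
             interface, not part of the thesis).
    Returns the label of the step that worked, or "contact vendor" if
    every step failed.
    """
    for label, action in steps:
        if action(component):
            return label
    return "contact vendor"
```

A caller would supply the steps in the order the text gives them: some form of DDT, a restart from the network history file, and finally a reload from a fresh copy of the software.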

During the course of problem identification and
correction, it is required that the network remain available
for customer use. To do this, the network operator must
have the capability of reconfiguring the network, disabling
processing of local operator requests so that he is in full
control of the network, activating and deactivating a
component's connection to the bus, and transferring
functions performed by the failed component to another
device capable of performing those functions. Although the
performance of the network during this time will not be
optimum, it will at least be able to support some processing
requirements. Upon failure correction, the operator is
responsible for bringing the system back to a state of
normal operation. This would include updating the
appropriate data bases, returning functional
responsibilities as required, and notifying users of the
resumption of normal services.
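
The transfer of functions from a failed component to another capable device can be illustrated with a small Python sketch. The two dictionaries (function to current component, and component to the set of functions it can perform) are an assumed representation; the thesis does not prescribe one.

```python
def reassign_functions(assignments, capabilities, failed):
    """Move each function of the failed component to a capable device.

    assignments  -- dict mapping function name -> component performing it
    capabilities -- dict mapping component -> set of functions it can perform
    failed       -- name of the failed component
    Returns (new_assignments, unassigned), where unassigned lists the
    functions for which no other capable device exists.
    """
    new_assignments = dict(assignments)
    unassigned = []
    for function, owner in assignments.items():
        if owner != failed:
            continue
        # Consider every other component able to perform this function;
        # sorting makes the choice deterministic.
        candidates = sorted(
            name for name, functions in capabilities.items()
            if name != failed and function in functions
        )
        if candidates:
            new_assignments[function] = candidates[0]
        else:
            new_assignments[function] = None
            unassigned.append(function)
    return new_assignments, unassigned
```

Keeping the original assignments untouched makes it simple to return functional responsibilities once the failure has been corrected: the operator restores the earlier table.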
F.   CHAPTER SUMMARY

In this Chapter, we began by presenting the mission of a
network central monitoring site and the objectives to be met
in order to fulfill that mission. A discussion of the
manning and structural aspects of a CMS followed. Attention
was then focused on the description of a network operator's
workbench and its associated tools. Our final discussion
dealt with the identification of a network operator's
responsibilities under both normal and failure conditions.

It is the author's opinion that the mission and
objectives presented at the beginning of this Chapter
provide a complete and succinct picture of exactly why a
network central monitoring site exists and what services it
must provide for the network. The researcher recommends
that the monitoring of the network be automated to a point
such that only one operator and his staff are required to
'control' the network. It is also recommended that the
position of 'Director of the LAN' be established as part of
the CMS with authority over all aspects of network
utilization. The tools recommended as part of the
operator's workbench are seen as the basis upon which
network monitoring, control, and management will be
conducted. Without them, the accomplishment of the central
monitoring site's mission will be questionable. To
conclude, it is the researcher's opinion that the network
operator's responsibilities we have identified in this
Chapter, although not all-encompassing, can be used as a
basis upon which extended and more specific requirements can
be built.
APPENDIX A
PROBLEM MANAGEMENT RECORD ENTRIES

Time and date of problem awareness

Type of equipment and serial number

Remarks about the nature of the problem

Logical name of the affected network element

Target date and time for problem resolution

Current problem status

Assessment of problem's impact on network components

Cross reference to appropriate entry in configuration management database

Physical location of problem occurrence and of elements reporting the problem

Date and time of problem resolution

Point of contact and phone number through which additional information concerning the problem can be obtained
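
As an illustration only, the entries above could be held in a record structure such as the following Python sketch. The class name, field names, and the is_overdue helper are hypothetical and are not part of the thesis; the fields simply mirror the list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ProblemRecord:
    """Hypothetical problem management record mirroring the entries above."""
    detected_at: datetime              # time and date of problem awareness
    equipment_type: str
    serial_number: str
    remarks: str                       # nature of the problem
    logical_name: str                  # affected network element
    target_resolution: datetime
    status: str                        # current problem status
    impact_assessment: str
    config_record_ref: str             # cross reference to configuration entry
    problem_location: str
    reporting_elements_location: str
    point_of_contact: str
    contact_phone: str
    resolved_at: Optional[datetime] = None  # filled in upon resolution

    def is_overdue(self, now):
        """True if the problem is unresolved and past its target time."""
        return self.resolved_at is None and now > self.target_resolution
```

Leaving the resolution time empty until the problem is closed lets the same record serve both as an open trouble ticket and as a historical entry.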
APPENDIX B
CONFIGURATION  MANAGEMENT  RECORD  ENTRIES 

Item  of  equipment 

Model  and  serial  numbers 

Physical  address 

Component's logical name

Rental/Purchase  price 

Depreciation  information 

Installed/uninstalled status

Order  number 

Ship  date 

Lease start and end dates

Vendor name, phone number, and address

Associated node logical name

Item description/function

Remarks

Point of contact: name, location, telephone number

Building and room piece of equipment is in

Machine software was executing on when problem was detected